The effect of stress on the explore-exploit dilemma




Ferguson, Thomas




When humans are faced with multiple options, they must decide whether to choose a novel or less certain option (explore) or stick with what they know (exploit). Exploration is a fundamental cognitive process. To solve the explore-exploit dilemma, humans must effectively incorporate both feedback and uncertainty to guide their actions. While prior work has shown that both acute (short-term) and chronic (long-term) stress can disrupt how humans solve the explore-exploit dilemma, the mechanisms by which this occurs are unclear. For example, does stress disrupt how people integrate feedback to guide their decisions to explore or exploit, or does stress disrupt computations of uncertainty regarding their choices? Electroencephalography can help reveal the impact of stress on explore-exploit decision making by measuring neural signals sensitive to feedback learning and uncertainty. In the present dissertation, I provide evidence from a series of experiments in which I examined the impact of both acute and chronic stress on the explore-exploit dilemma while electroencephalographic data were collected. In experiment 1, I exposed participants to an acute stressor and then examined their decisions to switch or stay, as a proxy for explore and exploit decisions, in a multi-armed bandit paradigm. I found tentative evidence that the acute stress response disrupted both the feedback learning signal (the reward positivity) and the uncertainty signal (the switch P300). In experiment 2, I adopted a computational neuroscience approach and directly classified participants' decisions as explorations or exploitations using reinforcement learning models. Here, the acute stress response affected only feedback signals, in this case the feedback P300.
In experiments 1 and 2, I used contextual bandit tasks where the reward probabilities of the options shifted throughout, and there was no behavioural effect of acute stress on task performance or exploration rate. However, in experiment 3, I examined a learnable bandit where one option was preferred. Again using computational modelling and electroencephalography, I found tentative evidence that the acute stress response disrupted the feedback learning signal (the feedback P300) and stronger evidence that acute stress disrupted the uncertainty signal (the exploration P300). I also observed that the acute stress response reduced task performance and increased exploration rate. Lastly, in experiment 4, I examined the impact of chronic stress exposure on explore-exploit decision making and electrophysiology. While I found no effects of chronic stress, I believe further research is necessary. Taken together, these findings provide novel evidence for the neural mechanisms by which the acute stress response impacts the explore-exploit dilemma through disruptions to feedback learning and assessments of uncertainty. These findings also highlight how theories of the P300 signal may not be properly capturing the varied role of the P300 in cognition.
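
As an illustration of what it can mean to classify decisions as explorations or exploitations with a reinforcement learning model, the following is a minimal sketch. It is not the model used in the dissertation, whose details are not given here. It assumes a simple delta-rule value update with a fixed learning rate, and it labels a choice as an exploitation when the chosen option currently has the highest estimated value. The function name `classify_choices`, the parameter `alpha`, and the tie-handling rule are all hypothetical choices made for this example.

```python
from typing import List, Sequence


def classify_choices(
    choices: Sequence[int],
    rewards: Sequence[float],
    n_arms: int,
    alpha: float = 0.1,
) -> List[str]:
    """Label each bandit choice as 'explore' or 'exploit'.

    Assumption (not from the source): values are learned with a
    delta rule, V[c] += alpha * (reward - V[c]), starting from 0.
    A choice counts as 'exploit' if the chosen arm has the highest
    current value estimate (ties count as 'exploit').
    """
    if len(choices) != len(rewards):
        raise ValueError("choices and rewards must have the same length")

    values = [0.0] * n_arms
    labels: List[str] = []

    for choice, reward in zip(choices, rewards):
        # Classify using the estimates held *before* seeing this trial's feedback.
        best_value = max(values)
        labels.append("exploit" if values[choice] == best_value else "explore")

        # Delta-rule update: move the estimate toward the observed reward.
        values[choice] += alpha * (reward - values[choice])

    return labels


if __name__ == "__main__":
    # Hypothetical four-trial sequence on a two-armed bandit.
    labels = classify_choices(
        choices=[0, 1, 1, 0],
        rewards=[1.0, 0.0, 1.0, 0.0],
        n_arms=2,
        alpha=0.5,
    )
    print(labels)
```

In practice, the learning rate and any choice-rule parameters would typically be fitted to each participant's behaviour rather than fixed in advance, and the trial labels could then be used to sort neural responses by decision type.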



Explore-Exploit, Acute Stress, Chronic Stress, Reinforcement Learning, Decision Making, Bandit Task