M.Sc. Thesis: David Sharma on Adaptive Training Strategies for Brain-Computer-Interfaces
Supervisors: Elmar Rueckert, Prof. Dr. Jan Peters, Dr. Ing. Moritz Grosse-Wentrup
A problem of today's brain-computer interface (BCI) systems is that performance in controlling a BCI can decrease rapidly over time. This is due to the non-stationarity of the recorded electroencephalography (EEG) signals. Furthermore, the motivation of the subject can drop if the subject does not experience any success in controlling the BCI. A possible solution to these problems is to provide the subject with continuous feedback and to train a reinforcement learning (RL) agent on the task so that it can support the subject in solving it. A selection policy (implemented through a Monte-Carlo sampling process) selects either the command generated by the subject or the command generated by the RL agent. Especially in the beginning, the RL agent controls the actions of the task most of the time. As the experiment proceeds, the influence of the agent decreases and the subject gains more control over the actions. The subject is not aware of the RL agent. To measure the performance of subjects, we implemented a scoring system that assigns positive or negative rewards to the subject according to their current performance, i.e., how well the subject solves the task.
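The selection policy described above can be illustrated with a minimal sketch. The abstract only states that the policy samples between the subject's command and the agent's command and that the agent's influence decreases over the experiment; the linear decay schedule, the start and end probabilities, and all function and parameter names below are assumptions made for illustration, not the implementation used in the thesis.

```python
import random


def agent_probability(trial, n_trials, p_start=0.9, p_end=0.0):
    """Probability that the RL agent's command is executed on a given trial.

    Assumption: a linear decay from p_start to p_end over the experiment.
    The abstract does not specify the actual schedule.
    """
    if n_trials <= 1:
        return p_end
    # Fraction of the experiment completed, clamped to [0, 1].
    frac = min(max(trial / (n_trials - 1), 0.0), 1.0)
    return p_start + (p_end - p_start) * frac


def select_command(subject_cmd, agent_cmd, p_agent, rng=random):
    """Sample which command is executed (hypothetical helper).

    With probability p_agent the agent's command is used,
    otherwise the command decoded from the subject is used.
    """
    return agent_cmd if rng.random() < p_agent else subject_cmd


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    n_trials = 10
    for trial in range(n_trials):
        p = agent_probability(trial, n_trials)
        # "jump" / "idle" are placeholder commands for the game figure.
        cmd = select_command("idle", "jump", p, rng)
        print(f"trial {trial}: p_agent={p:.2f} -> executed {cmd}")
```

Under this sketch the agent dominates early trials and the subject's own commands are executed more and more often as the experiment proceeds, which matches the qualitative behaviour described in the text. Any other monotonically decreasing schedule could be substituted for the linear one.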
We implemented a game in which the subject controls a game figure by imagining limb movements in order to jump over approaching obstacles. In our experiments, we collected data from 20 subjects. The evaluation of the gathered results shows a positive trend: subjects who trained with the reinforcement learning agent performed better than subjects who did not. We also wanted to test whether the subjects were able to adapt to new environments after the training. We first trained a classifier on the data from the training phase and used this classifier to decode new incoming EEG signals. We then confronted the subjects with new obstacles. Unfortunately, the performance of both the subjects and the classifier was poor, so we could not verify that the subjects were able to adapt to new environments.