About Me

I research various topics, mostly in the field of reinforcement learning. I am currently a Ph.D. student in the Reinforcement Learning and Artificial Intelligence lab, part of the Alberta Machine Intelligence Institute and the Department of Computing Science at the University of Alberta. My supervisor is Professor Rich Sutton.

Research Interests

My current research interests include off-policy learning, policy gradient algorithms, and fundamental reinforcement learning algorithms.

The underlying motivation for my research can be summarized by the following question: Given past experiences, how can we make better decisions for the future? The answers to this question are as relevant to the construction of artificially intelligent agents as they are to the choices we make in our daily lives and the way we choose to structure the society we live in. For this reason, I’m interested in both the underlying theory and the applications of reinforcement learning.

One approach to answering the above question is policy gradient algorithms, which adjust their policy so as to more often choose actions that improve long-term performance. Policy gradient methods remain an active area of research, with many open questions and interesting challenges in applying them to real-world problems.
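The core idea can be sketched in a few lines. Below is a minimal, hypothetical REINFORCE-style example on a toy two-armed bandit (the problem, step size, and baseline scheme are illustrative assumptions, not part of any particular paper): the policy is a softmax over action preferences, and after each reward the preferences are nudged toward actions that did better than a running baseline.

```python
import numpy as np

# Toy two-armed bandit (illustrative assumption): arm 1 pays more on average.
rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])

prefs = np.zeros(2)   # action preferences = policy parameters
alpha = 0.1           # step size (arbitrary choice)
baseline = 0.0        # running-average reward baseline

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for t in range(2000):
    pi = softmax(prefs)
    a = rng.choice(2, p=pi)
    r = rng.normal(true_means[a], 0.1)
    baseline += 0.01 * (r - baseline)
    # Gradient of log pi(a) for a softmax policy: one-hot(a) - pi.
    grad_log_pi = -pi
    grad_log_pi[a] += 1.0
    # Move preferences in the direction that makes better-than-baseline
    # actions more likely — the policy gradient update.
    prefs += alpha * (r - baseline) * grad_log_pi

print(softmax(prefs))  # probability mass concentrates on the better arm
```

After training, nearly all of the policy's probability sits on the higher-paying arm, which is exactly the "choose good actions more often" behavior described above.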

Off-policy learning is crucial for most applications because it allows an agent to learn from the experiences of others, such as human experts. This is important because decisions made in the early stages of trial-and-error learning are often suboptimal; I’d rather walk than get a ride from a self-driving car during its first attempts to drive. If instead the self-driving car had achieved superhuman performance by learning (off-policy) from billions of hours of human driving, I would gladly accept a ride.
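One standard way to learn about one policy from another's experience is importance sampling. The sketch below is a minimal, assumed setup (a one-step bandit with made-up reward and policy values): we estimate the expected reward of a target policy using only actions drawn from a different behavior policy, by reweighting each sample by the ratio of policy probabilities.

```python
import numpy as np

# Hypothetical one-step problem: two actions with fixed rewards.
rng = np.random.default_rng(1)
rewards = np.array([0.0, 1.0])

behavior = np.array([0.7, 0.3])  # policy that generated the data
target = np.array([0.1, 0.9])    # policy we want to evaluate

# Collect experience under the behavior policy only.
actions = rng.choice(2, size=10_000, p=behavior)
r = rewards[actions]

# Importance-sampling ratio: how much more (or less) likely the target
# policy was to take each observed action than the behavior policy was.
rho = target[actions] / behavior[actions]
estimate = np.mean(rho * r)

true_value = float(np.dot(target, rewards))  # exact expected reward: 0.9
print(estimate, true_value)
```

The reweighted average converges to the target policy's true expected reward even though the target policy never acted, which is the essence of learning off-policy from others' data.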

While applications of reinforcement learning hold the potential to improve quality of life for everyone, they also highlight areas where more theoretical progress needs to be made. Combining reinforcement learning with powerful function approximators like neural networks has resulted in both impressive successes and surprising failures. These failures suggest ways that fundamental reinforcement learning algorithms could be improved to expand the range of possible applications and the positive impact they can have on the world.