Matthew Riemer

I work as a researcher in the AI department at the IBM T.J. Watson Research Center. I am also a final-year PhD student at Mila (Quebec AI Institute), affiliated with Université de Montréal, where I am supervised by Irina Rish.

I am broadly interested in how to build agents that learn efficiently over non-i.i.d. or non-stationary data distributions, a topic often referred to as lifelong or continual learning. Over the course of my career, my research has spanned many aspects of this problem: the role that meta-learning and modular architectures can play in better balancing stability and plasticity, the formulation of the problem and of the correct objective to optimize, adaptation in multi-agent environments where other agents can change their policies, and learning adaptive policies with hierarchical reinforcement learning. If you are interested in working with me on related topics, please reach out by email!

Email  /  Google Scholar  /  GitHub  /  LinkedIn

Research Directions

My primary ongoing research directions include:

Identifying Challenges in Optimizing the Stability-Plasticity Dilemma: A key direction of my research has been advocating for formulating the stability-plasticity dilemma as an optimization problem within the framework of reinforcement learning in continuing environments. The central question in resolving this dilemma is to what degree we expect particular information to remain relevant in the long-term future. However, existing approaches cannot learn efficiently in the presence of high mixing times, which we have shown to be pervasive in continual learning problems; the formalization below makes this framing concrete.
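
For context, here is a minimal formalization of this framing in standard average-reward notation; it is a generic textbook statement rather than a result from any of the papers below. In a continuing environment the natural objective is the long-run average reward, and the mixing time of the Markov chain induced by a policy governs how much experience is needed before empirical estimates reflect that objective rather than transient behavior.

    % Average-reward objective of a policy \pi in a continuing MDP:
    \rho(\pi) = \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T} r_t\right]
    % Informally, the mixing time t_{mix}(\pi) is the number of steps until the
    % state distribution induced by \pi is close to its stationary distribution
    % d^{\pi}. When t_{mix}(\pi) is large, returns measured over short horizons
    % can be badly biased estimates of \rho(\pi), which is what makes naive
    % optimization unreliable in these settings.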

Learning to Continually Learn: My most influential work to date has been the invention of the "meta-experience replay" technique, which combines experience replay with a meta-learning process in which the agent learns to align gradients across continual updates, improving both stability and plasticity when training neural networks. Our recent study demonstrates that this technique remains a state-of-the-art approach today, improving on the performance of experience replay for continual pre-training of LLMs while incurring negligible compute overhead; a minimal sketch of the mechanism follows.
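
The sketch below illustrates the core idea in PyTorch. It is simplified and illustrative rather than the reference implementation: the function names are mine, the replay buffer is a plain reservoir sample, examples are assumed to be (input, target) pairs that already carry a batch dimension, and the paper's two levels of interpolation (within and across batches) are collapsed into a single Reptile-style step.

    import copy
    import random
    import torch

    def reservoir_add(buffer, item, capacity, n_seen):
        # Reservoir sampling keeps a uniform subsample of the stream seen so far.
        if len(buffer) < capacity:
            buffer.append(item)
        else:
            j = random.randint(0, n_seen)
            if j < capacity:
                buffer[j] = item

    def mer_step(model, loss_fn, buffer, current, lr=0.03, gamma=0.3, k=9):
        # One update on the current example plus up to k replayed examples.
        anchor = copy.deepcopy(model.state_dict())  # theta_0, before adaptation
        batch = random.sample(buffer, min(k, len(buffer))) + [current]
        for x, y in batch:
            # Inner loop: plain SGD, one example at a time.
            model.zero_grad()
            loss_fn(model(x), y).backward()
            with torch.no_grad():
                for p in model.parameters():
                    if p.grad is not None:
                        p -= lr * p.grad
        # Reptile-style meta-update: theta <- theta_0 + gamma * (theta_k - theta_0).
        # To first order this rewards updates whose per-example gradients align,
        # which is the transfer/interference trade-off the technique targets.
        with torch.no_grad():
            for name, p in model.state_dict().items():
                p.copy_(anchor[name] + gamma * (p - anchor[name]))

In use, each incoming example from the stream would be passed to reservoir_add and then to mer_step; the lr, gamma, and k values above are arbitrary placeholders rather than recommended settings.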

Dynamic Multi-Agent Interaction: The world is largely stationary in the sense that the laws of physics do not change. When we say that the world is changing, we are generally referring to changes in the behavior of other agents. As such, I view the study of how to adapt to other agents as they learn as a key aspect of the continual learning problem.

Recent Research Papers

Here is a selection of the published research papers I have collaborated on since 2024:

Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models

Istabrak Abbes*, Gopeshh Subbaraj*, Matthew Riemer, Nizar Islah, Benjamin Therien, Tsuguchika Tabaru, Hiroaki Kingetsu, Sarath Chandar, and Irina Rish
Conference on Lifelong Learning Agents (CoLLAs) 2025

Paper | Code

Finding the FrameStack: Learning What to Remember for Non-Markovian Reinforcement Learning

Geraud Nangue Tasse, Matthew Riemer, Benjamin Rosman, and Tim Klinger
Reinforcement Learning Conference (RLC) 2025, Finding the Frame Workshop

Paper

Combining Domain and Alignment Vectors Provides Better Knowledge-Safety Trade-offs in LLMs

Megh Thakkar, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, and Sarath Chandar
Association for Computational Linguistics (ACL) 2025

Paper

EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts

Subhajit Chaudhury, Payel Das, Sarath Swaminathan, Georgios Kollias, Elliot Nelson, Khushbu Pahwa, Tejaswini Pedapati, Igor Melnyk, and Matthew Riemer
Association for Computational Linguistics (ACL) 2025

Paper

Position: Theory of Mind Benchmarks are Broken for Large Language Models

Matthew Riemer, Zahra Ashktorab, Djallel Bouneffouf, Payel Das, Miao Liu, Justin D Weisz, and Murray Campbell
International Conference on Machine Learning (ICML) 2025

Paper

Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference

Matthew Riemer*, Gopeshh Subbaraj*, Glen Berseth, and Irina Rish
International Conference on Learning Representations (ICLR) 2025

Paper | Code | Blog

Handling Delay in Realtime Reinforcement Learning

Ivan Anokhin, Rishav Rishav, Matthew Riemer, Stephen Chung, Irina Rish, and Samira Ebrahimi Kahou
International Conference on Learning Representations (ICLR) 2025

Paper | Code | Blog

Contextual Value Alignment

Pierre Dognin, Jesus Rios, Ronny Luss, Prasanna Sattigeri, Miao Liu, Inkit Padhi, Matthew Riemer, Manish Nagireddy, Kush Varshney, and Djallel Bouneffouf
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025

Paper

Balancing Context Length and Mixing Times for Reinforcement Learning at Scale

Matthew Riemer, Khimya Khetarpal, Janarthanan Rajendran, and Sarath Chandar
Conference on Neural Information Processing Systems (NeurIPS) 2024

Paper | Code

A Deep Dive into the Trade-Offs of Parameter Efficient Preference Alignment Techniques

Megh Thakkar, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, and Sarath Chandar
Association for Computational Linguistics (ACL) 2024

Paper

ComVas: Contextual Moral Values Alignment System

Inkit Padhi, Pierre Dognin, Jesus Rios, Ronny Luss, Swapnaja Achintalwar, Matthew Riemer, Miao Liu, Prasanna Sattigeri, Manish Nagireddy, Kush Varshney, and Djallel Bouneffouf
International Joint Conference on Artificial Intelligence (IJCAI) 2024

Paper

Realtime Reinforcement Learning: Towards Rapid Asynchronous Deployment of Large Models

Matthew Riemer*, Gopeshh Subbaraj*, Glen Berseth, and Irina Rish
Reinforcement Learning Conference (RLC) 2024, Finding the Frame Workshop

Paper

Scalable Approaches for a Theory of Many Minds

Maximilian Puelma Touzel, Amin Memarian, Matthew Riemer, Andrei Mircea, Andrew Williams, Elin Ahlstrand, Lucas Lehnert, Rupali Bhati, Guillaume Dumas, and Irina Rish
International Conference on Machine Learning (ICML) 2024, Agentic Markets Workshop

Paper