Matthew Riemer

I work as a researcher in the AI department at the IBM T.J. Watson Research Center. I am also a final-year PhD student at Mila (Quebec AI Institute), affiliated with Université de Montréal, where I am supervised by Irina Rish.

I am broadly interested in how to build agents that learn efficiently over non-i.i.d. or non-stationary data distributions, a topic often referred to as lifelong or continual learning. Over the course of my career, my research has spanned many aspects of this problem: the role that meta-learning and modular architectures can play in better balancing stability and plasticity, the formulation of the problem and of the correct objective to optimize, adaptation in multi-agent environments where other agents can change their policies, and learning adaptive policies with hierarchical reinforcement learning. If you are interested in working with me on related topics, please reach out by email!

Email  /  Google Scholar  /  GitHub  /  LinkedIn

Research Directions

My primary ongoing research directions include:

Identifying Challenges in Optimizing the Stability-Plasticity Dilemma: A key direction of my research has been advocating for formulating the stability-plasticity dilemma as an optimization problem within the framework of reinforcement learning in continuing environments. The central question in resolving this dilemma is to what degree we expect particular information to remain relevant in the long-term future. However, existing approaches cannot learn efficiently in the presence of high mixing times, which we have shown to be pervasive in continual learning problems; the formalization below makes this framing concrete.
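
For context, here is a minimal formalization of this framing in standard average-reward notation; it is a generic textbook statement rather than a result from any of the papers below. In a continuing environment the natural objective is the long-run average reward, and the mixing time of the Markov chain induced by a policy governs how much experience is needed before empirical estimates reflect that objective rather than transient behavior.

    % Average-reward objective of a policy \pi in a continuing MDP:
    \rho(\pi) = \lim_{T \to \infty} \frac{1}{T}\, \mathbb{E}_{\pi}\!\left[\sum_{t=1}^{T} r_t\right]
    % Informally, the mixing time t_{mix}(\pi) is the number of steps until the
    % state distribution induced by \pi is close to its stationary distribution
    % d^{\pi}. When t_{mix}(\pi) is large, returns measured over short horizons
    % can be badly biased estimates of \rho(\pi), which is what makes naive
    % optimization unreliable in these settings.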

Learning to Continually Learn: My most influential work to date has been the invention of the "meta-experience replay" technique, which combines experience replay with a meta-learning process in which the agent learns to align gradients across continual updates, improving both stability and plasticity when training neural networks. Our recent study demonstrates that this technique remains a state-of-the-art approach today, improving on the performance of experience replay for continual pre-training of LLMs while incurring negligible compute overhead; a minimal sketch of the mechanism follows.
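
The sketch below illustrates the core idea in PyTorch. It is simplified and illustrative rather than the reference implementation: the function names are mine, the replay buffer is a plain reservoir sample, examples are assumed to be (input, target) pairs that already carry a batch dimension, and the paper's two levels of interpolation (within and across batches) are collapsed into a single Reptile-style step.

    import copy
    import random
    import torch

    def reservoir_add(buffer, item, capacity, n_seen):
        # Reservoir sampling keeps a uniform subsample of the stream seen so far.
        if len(buffer) < capacity:
            buffer.append(item)
        else:
            j = random.randint(0, n_seen)
            if j < capacity:
                buffer[j] = item

    def mer_step(model, loss_fn, buffer, current, lr=0.03, gamma=0.3, k=9):
        # One update on the current example plus up to k replayed examples.
        anchor = copy.deepcopy(model.state_dict())  # theta_0, before adaptation
        batch = random.sample(buffer, min(k, len(buffer))) + [current]
        for x, y in batch:
            # Inner loop: plain SGD, one example at a time.
            model.zero_grad()
            loss_fn(model(x), y).backward()
            with torch.no_grad():
                for p in model.parameters():
                    if p.grad is not None:
                        p -= lr * p.grad
        # Reptile-style meta-update: theta <- theta_0 + gamma * (theta_k - theta_0).
        # To first order this rewards updates whose per-example gradients align,
        # which is the transfer/interference trade-off the technique targets.
        with torch.no_grad():
            for name, p in model.state_dict().items():
                p.copy_(anchor[name] + gamma * (p - anchor[name]))

In use, each incoming example from the stream would be passed to reservoir_add and then to mer_step; the lr, gamma, and k values above are arbitrary placeholders rather than recommended settings.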

Dynamic Multi-Agent Interaction: The world is largely stationary in the sense that the laws of physics do not change. When we say that the world is changing, we are generally referring to changes in the behavior of other agents. As such, I view the study of how to adapt to other agents as they learn as a key aspect of the continual learning problem.

Recent Research Papers

Here is a selection of the published research papers I have collaborated on since 2024:

Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models

Istabrak Abbes*, Gopeshh Subbaraj*, Matthew Riemer, Nizar Islah, Benjamin Therien, Tsuguchika Tabaru, Hiroaki Kingetsu, Sarath Chandar, and Irina Rish
Conference on Lifelong Learning Agents (CoLLAs) 2025

Paper | Code

Finding the FrameStack: Learning What to Remember for Non-Markovian Reinforcement Learning

Geraud Nangue Tasse, Matthew Riemer, Benjamin Rosman, and Tim Klinger
Reinforcement Learning Conference (RLC) 2025, Finding the Frame Workshop

Paper

Combining Domain and Alignment Vectors Provides Better Knowledge-Safety Trade-offs in LLMs

Megh Thakkar, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, and Sarath Chandar
Association for Computational Linguistics (ACL) 2025

Paper

EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts

Subhajit Chaudhury, Payel Das, Sarath Swaminathan, Georgios Kollias, Elliot Nelson, Khushbu Pahwa, Tejaswini Pedapati, Igor Melnyk, and Matthew Riemer
Association for Computational Linguistics (ACL) 2025

Paper

Position: Theory of Mind Benchmarks are Broken for Large Language Models

Matthew Riemer, Zahra Ashktorab, Djallel Bouneffouf, Payel Das, Miao Liu, Justin D Weisz, and Murray Campbell
International Conference on Machine Learning (ICML) 2025

Paper

Enabling Realtime Reinforcement Learning at Scale with Staggered Asynchronous Inference

Matthew Riemer*, Gopeshh Subbaraj*, Glen Berseth, and Irina Rish
International Conference on Learning Representations (ICLR) 2025

Paper | Code | Blog

Handling Delay in Realtime Reinforcement Learning

Ivan Anokhin, Rishav Rishav, Matthew Riemer, Stephen Chung, Irina Rish, and Samira Ebrahimi Kahou
International Conference on Learning Representations (ICLR) 2025

Paper | Code | Blog

Contextual Value Alignment

Pierre Dognin, Jesus Rios, Ronny Luss, Prasanna Sattigeri, Miao Liu, Inkit Padhi, Matthew Riemer, Manish Nagireddy, Kush Varshney, and Djallel Bouneffouf
International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025

Paper

Balancing Context Length and Mixing Times for Reinforcement Learning at Scale

Matthew Riemer, Khimya Khetarpal, Janarthanan Rajendran, and Sarath Chandar
Conference on Neural Information Processing Systems (NeurIPS) 2024

Paper | Code

A Deep Dive into the Trade-Offs of Parameter Efficient Preference Alignment Techniques

Megh Thakkar, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, and Sarath Chandar
Association for Computational Linguistics (ACL) 2024

Paper

ComVas: Contextual Moral Values Alignment System

Inkit Padhi, Pierre Dognin, Jesus Rios, Ronny Luss, Swapnaja Achintalwar, Matthew Riemer, Miao Liu, Prasanna Sattigeri, Manish Nagireddy, Kush Varshney, and Djallel Bouneffouf
International Joint Conference on Artificial Intelligence (IJCAI) 2024

Paper

Realtime Reinforcement Learning: Towards Rapid Asynchronous Deployment of Large Models

Matthew Riemer*, Gopeshh Subbaraj*, Glen Berseth, and Irina Rish
Reinforcement Learning Conference (RLC) 2024, Finding the Frame Workshop

Paper

Scalable Approaches for a Theory of Many Minds

Maximilian Puelma Touzel, Amin Memarian, Matthew Riemer, Andrei Mircea, Andrew Williams, Elin Ahlstrand, Lucas Lehnert, Rupali Bhati, Guillaume Dumas, and Irina Rish
International Conference on Machine Learning (ICML) 2024, Agentic Markets Workshop

Paper