Sinopsis
TalkRL podcast is All Reinforcement Learning, All the time. In-depth interviews with brilliant people at the forefront of RL research and practice. Hosted by Robin Ranjit Singh Chauhan. Technical content.
Episodios
-
Jake Beck, Alex Goldie, & Cornelius Braun on Sutton's OaK, Metalearning, LLMs, Squirrels @ RLC 2025
19/08/2025 Duración: 12minRecorded at Reinforcement Learning Conference 2025 at University of Alberta, Edmonton Alberta Canada.Featured ReferencesLecture on the Oak Architecture, Rich SuttonAlberta Plan, Rich Sutton with Mike Bowling and Patrick Pilarski Additional ReferencesJacob Beck on Google Scholar Alex Goldie on Google ScholarCornelius Braun on Google ScholarReinforcement Learning Conference
-
Outstanding Paper Award Winners - 2/2 @ RLC 2025
18/08/2025 Duración: 14minWe caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton Alberta Canada in August 2025.Featured References Empirical Reinforcement Learning ResearchMitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functionsAyush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Biyik, Joseph J LimApplications of Reinforcement LearningWOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management StrategiesWilliam Solow, Sandhya Saisubramanian, Alan FernEmerging Topics in Reinforcement LearningTowards Improving Reward Design in RL: A Reward Alignment Metric for RL PractitionersCalarina Muslimani, Kerrick Johnstonbaugh, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E. TaylorScientific Understanding in Reinforcement LearningMulti-Task Reinforcement Learning Enables Parameter ScalingReginald McLean, Evangelos Chatzaroulas, J K Terry, Isaac Woungang,
-
Outstanding Paper Award Winners - 1/2 @ RLC 2025
15/08/2025 Duración: 06minWe caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton Alberta Canada in August 2025.Featured References Scientific Understanding in Reinforcement Learning How Should We Meta-Learn Reinforcement Learning Algorithms? Alexander David Goldie, Zilin Wang, Jakob Nicolaus Foerster, Shimon Whiteson Tooling, Environments, and Evaluation for Reinforcement Learning Syllabus: Portable Curricula for Reinforcement Learning Agents Ryan Sullivan, Ryan Pégoud, Ameen Ur Rehman, Xinchen Yang, Junyun Huang, Aayush Verma, Nistha Mitra, John P Dickerson Resourcefulness in Reinforcement Learning PufferLib 2.0: Reinforcement Learning at 1M steps/s Joseph Suarez Theory of Reinforcement Learning Deep Reinforcement Learning with Gradient Eligibility Traces Esraa Elelimy, Brett Daley, Andrew Patterson, Marlos C. Machado, Adam White, Martha White
-
Thomas Akam on Model-based RL in the Brain
04/08/2025 Duración: 52minProf Thomas Akam is a Neuroscientist at the Oxford University Department of Experimental Psychology. He is a Wellcome Career Development Fellow and Associate Professor at the University of Oxford, and leads the Cognitive Circuits research group.Featured ReferencesBrain Architecture for Adaptive BehaviourThomas Akam, RLDM 2025 TutorialAdditional ReferencesThomas Akam on Google ScholarUncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control, Nathaniel D Daw, Yael Niv, Peter Dayan, 2005Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H. M., Milner, B., Corkin, S., & Teuber, H. L., 1968Internally generated cell assembly sequences in the rat hippocampus, Pastalkova E, Itskov V, Amarasingham A, Buzsáki G. Science. 2008Multi-disciplinary Conference on Reinforcement Learning and Decision 2025
-
Stefano Albrecht on Multi-Agent RL @ RLDM 2025
22/07/2025 Duración: 31minStefano V. Albrecht was previously Associate Professor at the University of Edinburgh, and is currently serving as Director of AI at startup Deepflow. He is a Program Chair of RLDM 2025 and is co-author of the MIT Press textbook "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches".Featured ReferencesMulti-Agent Reinforcement Learning: Foundations and Modern ApproachesStefano V. Albrecht, Filippos Christianos, Lukas SchäferMIT Press, 2024RLDM 2025: Reinforcement Learning and Decision Making ConferenceDublin, IrelandEPyMARL: Extended Python MARL frameworkhttps://github.com/uoe-agents/epymarlBenchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative TasksGeorgios Papoudakis and Filippos Christianos and Lukas Schäfer and Stefano V. Albrecht
-
Satinder Singh: The Origin Story of RLDM @ RLDM 2025
25/06/2025 Duración: 05minProfessor Satinder Singh of Google DeepMind and U of Michigan is co-founder of RLDM. Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference).Recorded on location at Trinity College Dublin, Ireland during RLDM 2025.Featured ReferencesRLDM 2025: Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM)June 11-14, 2025 at Trinity College Dublin, IrelandSatinder Singh on Google Scholar
-
NeurIPS 2024 - Posters and Hallways 3
09/03/2025 Duración: 10minPosters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring Claire Bizon Monroc from Inria: WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control Andrew Wagenmaker from UC Berkeley: Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL Harley Wiltzer from MILA: Foundations of Multivariate Distributional Reinforcement Learning Vinzenz Thoma from ETH AI Center: Contextual Bilevel Reinforcement Learning for Incentive Alignment Haozhe (Tony) Chen & Ang (Leon) Li from Columbia: QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers
-
NeurIPS 2024 - Posters and Hallways 2
05/03/2025 Duración: 08minPosters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring Jonathan Cook from University of Oxford: Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning Yifei Zhou from Berkeley AI Research: DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning Rory Young from University of Glasgow: Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach Glen Berseth from MILA: Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn Alexander Rutherford from University of Oxford: JaxMARL: Multi-Agent RL Environments and Algorithms in JAX
-
NeurIPS 2024 - Posters and Hallways 1
03/03/2025 Duración: 09minPosters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring Jiaheng Hu of University of Texas: Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning Skander Moalla of EPFL: No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO Adil Zouitine of IRT Saint Exupery/Hugging Face : Time-Constrained Robust MDPs Soumyendu Sarkar of HP Labs : SustainDC: Benchmarking for Sustainable Data Center Control Matteo Bettini of Cambridge University: BenchMARL: Benchmarking Multi-Agent Reinforcement Learning Michael Bowling of U Alberta : Beyond Optimism: Exploration With Partially Observable Rewards
-
Abhishek Naik
10/02/2025 Duración: 01h21minAbhishek Naik was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton. Now he is a postdoc fellow at the National Research Council of Canada, where he does AI research on Space applications. Featured References Reinforcement Learning for Continuing Problems Using Average Reward Abhishek Naik dissertation 2024 Reward Centering Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutto 2024 Learning and Planning in Average-Reward Markov Decision Processes Yi Wan, Abhishek Naik, Richard S. Sutton 2020 Discounted Reinforcement Learning Is Not an Optimization Problem Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton 2019 Additional References Explaining dopamine through prediction errors and beyond, Gershman et al 2024 (proposes Differential-TD-like learning mechanism in the brain around Box 4)
-
Neurips 2024 RL meetup Hot takes: What sucks about RL?
23/12/2024 Duración: 17minWhat do RL researchers complain about after hours at the bar? In this "Hot takes" episode, we find out! Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of Neurips 2024. Special thanks to "David Beckham" for the inspiration :)
-
RLC 2024 - Posters and Hallways 5
20/09/2024 Duración: 13minPosters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Radke of the Chicago Blackhawks NHL on RL for professional sports 0:56 Abhishek Naik from the National Research Council on Continuing RL and Average Reward 2:42 Daphne Cornelisse from NYU on Autonomous Driving and Multi-Agent RL 08:58 Shray Bansal from Georgia Tech on Cognitive Bias for Human AI Ad hoc Teamwork 10:21 Claas Voelcker from University of Toronto on Can we hop in general? 11:23 Brent Venable from The Institute for Human & Machine Cognition on Cooperative information dissemination
-
RLC 2024 - Posters and Hallways 4
19/09/2024 Duración: 04minPosters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Abel from DeepMind on 3 Dogmas of RL 0:55 Kevin Wang from Brown on learning variable depth search for MCTS 2:17 Ashwin Kumar from Washington University in St Louis on fairness in resource allocation 3:36 Prabhat Nagarajan from UAlberta on Value overestimation
-
RLC 2024 - Posters and Hallways 3
18/09/2024 Duración: 06minPosters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Kris De Asis from Openmind on Time Discretization 2:23 Anna Hakhverdyan from U of Alberta on Online Hyperparameters 3:59 Dilip Arumugam from Princeton on Information Theory and Exploration 5:04 Micah Carroll from UC Berkeley on Changing preferences and AI alignment
-
RLC 2024 - Posters and Hallways 2
16/09/2024 Duración: 15minPosters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Hector Kohler from Centre Inria de l'Université de Lille with "Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning" 2:29 Quentin Delfosse from TU Darmstadt on "Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents" 4:15 Sonja Johnson-Yu from Harvard on "Understanding biological active sensing behaviors by interpreting learned artificial agent policies" 6:42 Jannis Blüml from TU Darmstadt on "OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments" 8:20 Cameron Allen from UC Berkeley on "Resolving Partial Observability in Decision Processes via the Lambda Discrepancy" 9:48 James Staley from Tufts on "Agent-Centric Human Demonstrations Train World Models" 14:54 Jonathan Li from Rensselaer Polytechnic Institute
-
RLC 2024 - Posters and Hallways 1
10/09/2024 Duración: 05minPosters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Ann Huang from Harvard on Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers 1:37 Jannis Blüml from TU Darmstadt on HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning 3:13 Benjamin Fuhrer from NVIDIA on Gradient Boosting Reinforcement Learning 3:54 Paul Festor from Imperial College London on Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics
-
Finale Doshi-Velez on RL for Healthcare @ RCL 2024
02/09/2024 Duración: 07minFinale Doshi-Velez is a Professor at the Harvard Paulson School of Engineering and Applied Sciences. This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024. Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts -- even though you can tell I was not prepared and a bit flustered tbh. Thanks to Prof Doshi-Velez for taking a moment for this, and I hope to cross paths in future for a more in depth interview. References Finale Doshi-Velez Homepage @ Harvard Finale Doshi-Velez on Google Scholar
-
David Silver 2 - Discussion after Keynote @ RCL 2024
28/08/2024 Duración: 16minThanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture. Recorded at UMass Amherst during RCL 2024.Due to the live recording environment, audio quality varies. We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion. References AlphaProof announcement on DeepMind's blogDiscovering Reinforcement Learning Algorithms, Oh et al -- His keynote at RLC 2024 referred to more recent update to this work, yet to be published Reinforcement Learning Conference 2024 David Silver on Google Scholar
-
David Silver @ RCL 2024
26/08/2024 Duración: 11minDavid Silver is a principal research scientist at DeepMind and a professor at University College London. This interview was recorded at UMass Amherst during RLC 2024. References David Silver on Google Scholar
-
Vincent Moens on TorchRL
08/04/2024 Duración: 40minDr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict in pytorch. Featured References TorchRL: A data-driven decision-making library for PyTorch Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens Additional References TorchRL on github TensorDict Documentation