TalkRL: Reinforcement Learning Interviews

  • Author: Various
  • Narrator: Various
  • Publisher: Podcast
  • Duration: 58:29:54

Synopsis

TalkRL podcast is All Reinforcement Learning, All the time. In-depth interviews with brilliant people at the forefront of RL research and practice. Hosted by Robin Ranjit Singh Chauhan. Technical content.

Episodes

  • Jake Beck, Alex Goldie, & Cornelius Braun on Sutton's OaK, Metalearning, LLMs, Squirrels @ RLC 2025

    19/08/2025 Duration: 12min

    Recorded at Reinforcement Learning Conference 2025 at University of Alberta, Edmonton, Alberta, Canada. Featured References: Lecture on the Oak Architecture, Rich Sutton; Alberta Plan, Rich Sutton with Mike Bowling and Patrick Pilarski. Additional References: Jacob Beck on Google Scholar; Alex Goldie on Google Scholar; Cornelius Braun on Google Scholar; Reinforcement Learning Conference.

  • Outstanding Paper Award Winners - 2/2 @ RLC 2025

    18/08/2025 Duration: 14min

    We caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton, Alberta, Canada, in August 2025. Featured References: Empirical Reinforcement Learning Research: "Mitigating Suboptimality of Deterministic Policy Gradients in Complex Q-functions", Ayush Jain, Norio Kosaka, Xinhu Li, Kyung-Min Kim, Erdem Biyik, Joseph J Lim. Applications of Reinforcement Learning: "WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies", William Solow, Sandhya Saisubramanian, Alan Fern. Emerging Topics in Reinforcement Learning: "Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners", Calarina Muslimani, Kerrick Johnstonbaugh, Suyog Chandramouli, Serena Booth, W. Bradley Knox, Matthew E. Taylor. Scientific Understanding in Reinforcement Learning: "Multi-Task Reinforcement Learning Enables Parameter Scaling", Reginald McLean, Evangelos Chatzaroulas, J K Terry, Isaac Woungang,

  • Outstanding Paper Award Winners - 1/2 @ RLC 2025

    15/08/2025 Duration: 06min

    We caught up with the RLC Outstanding Paper award winners for your listening pleasure. Recorded on location at Reinforcement Learning Conference 2025, at University of Alberta, in Edmonton, Alberta, Canada, in August 2025. Featured References: Scientific Understanding in Reinforcement Learning: "How Should We Meta-Learn Reinforcement Learning Algorithms?", Alexander David Goldie, Zilin Wang, Jakob Nicolaus Foerster, Shimon Whiteson. Tooling, Environments, and Evaluation for Reinforcement Learning: "Syllabus: Portable Curricula for Reinforcement Learning Agents", Ryan Sullivan, Ryan Pégoud, Ameen Ur Rehman, Xinchen Yang, Junyun Huang, Aayush Verma, Nistha Mitra, John P Dickerson. Resourcefulness in Reinforcement Learning: "PufferLib 2.0: Reinforcement Learning at 1M steps/s", Joseph Suarez. Theory of Reinforcement Learning: "Deep Reinforcement Learning with Gradient Eligibility Traces", Esraa Elelimy, Brett Daley, Andrew Patterson, Marlos C. Machado, Adam White, Martha White.

  • Thomas Akam on Model-based RL in the Brain

    04/08/2025 Duration: 52min

    Prof Thomas Akam is a Neuroscientist at the Oxford University Department of Experimental Psychology. He is a Wellcome Career Development Fellow and Associate Professor at the University of Oxford, and leads the Cognitive Circuits research group. Featured References: "Brain Architecture for Adaptive Behaviour", Thomas Akam, RLDM 2025 Tutorial. Additional References: Thomas Akam on Google Scholar; "Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control", Nathaniel D Daw, Yael Niv, Peter Dayan, 2005; "Further analysis of the hippocampal amnesic syndrome: 14-year follow-up study of H. M.", Milner, B., Corkin, S., & Teuber, H. L., 1968; "Internally generated cell assembly sequences in the rat hippocampus", Pastalkova E, Itskov V, Amarasingham A, Buzsáki G., Science, 2008; Multi-disciplinary Conference on Reinforcement Learning and Decision Making 2025.

  • Stefano Albrecht on Multi-Agent RL @ RLDM 2025

    22/07/2025 Duration: 31min

    Stefano V. Albrecht was previously Associate Professor at the University of Edinburgh, and is currently serving as Director of AI at startup Deepflow. He is a Program Chair of RLDM 2025 and is co-author of the MIT Press textbook "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches". Featured References: "Multi-Agent Reinforcement Learning: Foundations and Modern Approaches", Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer, MIT Press, 2024; RLDM 2025: Reinforcement Learning and Decision Making Conference, Dublin, Ireland; EPyMARL: Extended Python MARL framework (https://github.com/uoe-agents/epymarl); "Benchmarking Multi-Agent Deep Reinforcement Learning Algorithms in Cooperative Tasks", Georgios Papoudakis, Filippos Christianos, Lukas Schäfer, and Stefano V. Albrecht.

  • Satinder Singh: The Origin Story of RLDM @ RLDM 2025

    25/06/2025 Duration: 05min

    Professor Satinder Singh of Google DeepMind and U of Michigan is co-founder of RLDM. Here he narrates the origin story of the Reinforcement Learning and Decision Making meeting (not conference). Recorded on location at Trinity College Dublin, Ireland during RLDM 2025. Featured References: RLDM 2025: Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM), June 11-14, 2025 at Trinity College Dublin, Ireland; Satinder Singh on Google Scholar.

  • NeurIPS 2024 - Posters and Hallways 3

    09/03/2025 Duration: 10min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring: Claire Bizon Monroc from Inria: "WFCRL: A Multi-Agent Reinforcement Learning Benchmark for Wind Farm Control"; Andrew Wagenmaker from UC Berkeley: "Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL"; Harley Wiltzer from MILA: "Foundations of Multivariate Distributional Reinforcement Learning"; Vinzenz Thoma from ETH AI Center: "Contextual Bilevel Reinforcement Learning for Incentive Alignment"; Haozhe (Tony) Chen & Ang (Leon) Li from Columbia: "QGym: Scalable Simulation and Benchmarking of Queuing Network Controllers".

  • NeurIPS 2024 - Posters and Hallways 2

    05/03/2025 Duration: 08min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring: Jonathan Cook from University of Oxford: "Artificial Generational Intelligence: Cultural Accumulation in Reinforcement Learning"; Yifei Zhou from Berkeley AI Research: "DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning"; Rory Young from University of Glasgow: "Enhancing Robustness in Deep Reinforcement Learning: A Lyapunov Exponent Approach"; Glen Berseth from MILA: "Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn"; Alexander Rutherford from University of Oxford: "JaxMARL: Multi-Agent RL Environments and Algorithms in JAX".

  • NeurIPS 2024 - Posters and Hallways 1

    03/03/2025 Duration: 09min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at NeurIPS 2024 in Vancouver BC Canada. Featuring: Jiaheng Hu of University of Texas: "Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning"; Skander Moalla of EPFL: "No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO"; Adil Zouitine of IRT Saint Exupery/Hugging Face: "Time-Constrained Robust MDPs"; Soumyendu Sarkar of HP Labs: "SustainDC: Benchmarking for Sustainable Data Center Control"; Matteo Bettini of Cambridge University: "BenchMARL: Benchmarking Multi-Agent Reinforcement Learning"; Michael Bowling of U Alberta: "Beyond Optimism: Exploration With Partially Observable Rewards".

  • Abhishek Naik

    10/02/2025 Duration: 01h21min

    Abhishek Naik was a student at University of Alberta and Alberta Machine Intelligence Institute, and he just finished his PhD in reinforcement learning, working with Rich Sutton. Now he is a postdoc fellow at the National Research Council of Canada, where he does AI research on Space applications. Featured References: "Reinforcement Learning for Continuing Problems Using Average Reward", Abhishek Naik, dissertation, 2024; "Reward Centering", Abhishek Naik, Yi Wan, Manan Tomar, Richard S. Sutton, 2024; "Learning and Planning in Average-Reward Markov Decision Processes", Yi Wan, Abhishek Naik, Richard S. Sutton, 2020; "Discounted Reinforcement Learning Is Not an Optimization Problem", Abhishek Naik, Roshan Shariff, Niko Yasui, Hengshuai Yao, Richard S. Sutton, 2019. Additional References: "Explaining dopamine through prediction errors and beyond", Gershman et al, 2024 (proposes a Differential-TD-like learning mechanism in the brain, around Box 4).

  • Neurips 2024 RL meetup Hot takes: What sucks about RL?

    23/12/2024 Duration: 17min

    What do RL researchers complain about after hours at the bar?  In this "Hot takes" episode, we find out!  Recorded at The Pearl in downtown Vancouver, during the RL meetup after a day of Neurips 2024.  Special thanks to "David Beckham" for the inspiration :)  

  • RLC 2024 - Posters and Hallways 5

    20/09/2024 Duration: 13min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Radke of the Chicago Blackhawks NHL on RL for professional sports; 0:56 Abhishek Naik from the National Research Council on Continuing RL and Average Reward; 2:42 Daphne Cornelisse from NYU on Autonomous Driving and Multi-Agent RL; 8:58 Shray Bansal from Georgia Tech on Cognitive Bias for Human AI Ad hoc Teamwork; 10:21 Claas Voelcker from University of Toronto on Can we hop in general?; 11:23 Brent Venable from The Institute for Human & Machine Cognition on Cooperative information dissemination.

  • RLC 2024 - Posters and Hallways 4

    19/09/2024 Duration: 04min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 David Abel from DeepMind on 3 Dogmas of RL; 0:55 Kevin Wang from Brown on learning variable depth search for MCTS; 2:17 Ashwin Kumar from Washington University in St Louis on fairness in resource allocation; 3:36 Prabhat Nagarajan from UAlberta on Value overestimation.

  • RLC 2024 - Posters and Hallways 3

    18/09/2024 Duration: 06min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Kris De Asis from Openmind on Time Discretization; 2:23 Anna Hakhverdyan from U of Alberta on Online Hyperparameters; 3:59 Dilip Arumugam from Princeton on Information Theory and Exploration; 5:04 Micah Carroll from UC Berkeley on Changing preferences and AI alignment.

  • RLC 2024 - Posters and Hallways 2

    16/09/2024 Duration: 15min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Hector Kohler from Centre Inria de l'Université de Lille with "Interpretable and Editable Programmatic Tree Policies for Reinforcement Learning"; 2:29 Quentin Delfosse from TU Darmstadt on "Interpretable Concept Bottlenecks to Align Reinforcement Learning Agents"; 4:15 Sonja Johnson-Yu from Harvard on "Understanding biological active sensing behaviors by interpreting learned artificial agent policies"; 6:42 Jannis Blüml from TU Darmstadt on "OCAtari: Object-Centric Atari 2600 Reinforcement Learning Environments"; 8:20 Cameron Allen from UC Berkeley on "Resolving Partial Observability in Decision Processes via the Lambda Discrepancy"; 9:48 James Staley from Tufts on "Agent-Centric Human Demonstrations Train World Models"; 14:54 Jonathan Li from Rensselaer Polytechnic Institute.

  • RLC 2024 - Posters and Hallways 1

    10/09/2024 Duration: 05min

    Posters and Hallway episodes are short interviews and poster summaries. Recorded at RLC 2024 in Amherst MA. Featuring: 0:01 Ann Huang from Harvard on Learning Dynamics and the Geometry of Neural Dynamics in Recurrent Neural Controllers; 1:37 Jannis Blüml from TU Darmstadt on HackAtari: Atari Learning Environments for Robust and Continual Reinforcement Learning; 3:13 Benjamin Fuhrer from NVIDIA on Gradient Boosting Reinforcement Learning; 3:54 Paul Festor from Imperial College London on Evaluating the impact of explainable RL on physician decision-making in high-fidelity simulations: insights from eye-tracking metrics.

  • Finale Doshi-Velez on RL for Healthcare @ RLC 2024

    02/09/2024 Duration: 07min

    Finale Doshi-Velez is a Professor at the Harvard Paulson School of Engineering and Applied Sciences. This off-the-cuff interview was recorded at UMass Amherst during the workshop day of RL Conference on August 9th 2024. Host notes: I've been a fan of some of Prof Doshi-Velez' past work on clinical RL and hoped to feature her for some time now, so I jumped at the chance to get a few minutes of her thoughts, even though you can tell I was not prepared and a bit flustered tbh. Thanks to Prof Doshi-Velez for taking a moment for this, and I hope to cross paths in future for a more in-depth interview. References: Finale Doshi-Velez Homepage @ Harvard; Finale Doshi-Velez on Google Scholar.

  • David Silver 2 - Discussion after Keynote @ RLC 2024

    28/08/2024 Duration: 16min

    Thanks to Professor Silver for permission to record this discussion after his RLC 2024 keynote lecture. Recorded at UMass Amherst during RLC 2024. Due to the live recording environment, audio quality varies. We publish this audio in its raw form to preserve the authenticity and immediacy of the discussion. References: AlphaProof announcement on DeepMind's blog; "Discovering Reinforcement Learning Algorithms", Oh et al (his keynote at RLC 2024 referred to a more recent update to this work, yet to be published); Reinforcement Learning Conference 2024; David Silver on Google Scholar.

  • David Silver @ RLC 2024

    26/08/2024 Duration: 11min

    David Silver is a principal research scientist at DeepMind and a professor at University College London. This interview was recorded at UMass Amherst during RLC 2024. References: David Silver on Google Scholar.

  • Vincent Moens on TorchRL

    08/04/2024 Duration: 40min

    Dr. Vincent Moens is an Applied Machine Learning Research Scientist at Meta, and an author of TorchRL and TensorDict in PyTorch. Featured References: "TorchRL: A data-driven decision-making library for PyTorch", Albert Bou, Matteo Bettini, Sebastian Dittert, Vikash Kumar, Shagun Sodhani, Xiaomeng Yang, Gianni De Fabritiis, Vincent Moens. Additional References: TorchRL on GitHub; TensorDict Documentation.
