TalkRL: Reinforcement Learning Interviews

Natasha Jaques 2

Informações:

Sinopsis

Hear about why OpenAI cites her work in RLHF and dialog models, approaches to rewards in RLHF, ChatGPT, Industry vs Academia, PsiPhi-Learning, AGI and more!  Dr Natasha Jaques is a Senior Research Scientist at Google Brain. Featured References Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard  Sequence Tutor: Conservative Fine-Tuning of Sequence Generation Models with KL-control Natasha Jaques, Shixiang Gu, Dzmitry Bahdanau, José Miguel Hernández-Lobato, Richard E. Turner, Douglas Eck  PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar  Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience Marwa Abdulhai, Natasha Jaques, Sergey Levine  Additional References