Content provided by Robin Ranjit Singh Chauhan. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Robin Ranjit Singh Chauhan or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://id.player.fm/legal.
NeurIPS 2019 Deep RL Workshop
Thank you to all the presenters who participated. I covered as many as I could given the time and crowds. If you were not included and wish to be, please email talkrl@pathwayi.com
More details on the official NeurIPS Deep RL Workshop site.
- 0:23 Approximating two value functions instead of one: towards characterizing a new family of Deep Reinforcement Learning algorithms; Matthia Sabatelli (University of Liège); Gilles Louppe (University of Liège); Pierre Geurts (University of Liège); Marco Wiering (University of Groningen) [external pdf link].
- 4:16 Single Deep Counterfactual Regret Minimization; Eric Steinberger (University of Cambridge).
- 5:38 On the Convergence of Episodic Reinforcement Learning Algorithms at the Example of RUDDER; Markus Holzleitner (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); José Arjona-Medina (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria); Marius-Constantin Dinu (LIT AI Lab / University Linz); Sepp Hochreiter (LIT AI Lab, Institute for Machine Learning, Johannes Kepler University Linz, Austria).
- 9:33 Objective Mismatch in Model-based Reinforcement Learning; Nathan Lambert (UC Berkeley); Brandon Amos (Facebook); Omry Yadan (Facebook); Roberto Calandra (Facebook).
- 10:51 Option Discovery using Deep Skill Chaining; Akhil Bagaria (Brown University); George Konidaris (Brown University).
- 13:44 Blue River Controls: A toolkit for Reinforcement Learning Control Systems on Hardware; Kirill Polzounov (University of Calgary); Ramitha Sundar (Blue River Technology); Lee Reden (Blue River Technology).
- 14:52 LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games; Leonard Adolphs (ETHZ); Thomas Hofmann (ETH Zurich).
- 16:30 Accelerating Training in Pommerman with Imitation and Reinforcement Learning; Hardik Meisheri (TCS Research); Omkar Shelke (TCS Research); Richa Verma (TCS Research); Harshad Khadilkar (TCS Research).
- 17:27 Dream to Control: Learning Behaviors by Latent Imagination; Danijar Hafner (Google); Timothy Lillicrap (DeepMind); Jimmy Ba (University of Toronto); Mohammad Norouzi (Google Brain) [external pdf link].
- 20:48 Adaptive Temperature Tuning for Mellowmax in Deep Reinforcement Learning; Seungchan Kim (Brown University); George Konidaris (Brown).
- 22:05 Meta-learning curiosity algorithms; Ferran Alet (MIT); Martin Schneider (MIT); Tomas Lozano-Perez (MIT); Leslie Kaelbling (MIT).
- 24:09 Predictive Coding for Boosting Deep Reinforcement Learning with Sparse Rewards; Xingyu Lu (Berkeley); Stas Tiomkin (BAIR, UC Berkeley); Pieter Abbeel (UC Berkeley).
- 25:44 Swarm-inspired Reinforcement Learning via Collaborative Inter-agent Knowledge Distillation; Zhang-Wei Hong (Preferred Networks); Prabhat Nagarajan (Preferred Networks); Guilherme Maeda (Preferred Networks).
- 26:35 Multiplayer AlphaZero; Nicholas Petosa (Georgia Institute of Technology); Tucker Balch (Ga Tech) [external pdf link].
- 27:43 Prioritized Sequence Experience Replay; Marc Brittain (Iowa State University); Joshua Bertram (Iowa State University); Xuxi Yang (Iowa State University); Peng Wei (Iowa State University) [external pdf link].
- 29:14 Recurrent neural-linear posterior sampling for non-stationary bandits; Paulo Rauber (IDSIA); Aditya Ramesh (USI); Jürgen Schmidhuber (IDSIA - Lugano).
- 29:36 Improving Evolutionary Strategies With Past Descent Directions; Asier Mujika (ETH Zurich); Florian Meier (ETH Zurich); Marcelo Matheus Gauy (ETH Zurich); Angelika Steger (ETH Zurich) [external pdf link].
- 31:40 ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations; Daniel Seita (University of California, Berkeley); David Chan (University of California, Berkeley); Roshan Rao (UC Berkeley); Chen Tang (UC Berkeley); Mandi Zhao (UC Berkeley); John Canny (UC Berkeley) [external pdf link].
- 33:05 Bottom-Up Meta-Policy Search; Luckeciano Melo (Aeronautics Institute of Technology); Marcos Máximo (Aeronautics Institute of Technology); Adilson Cunha (Aeronautics Institute of Technology) [external pdf link].
- 33:37 MERL: Multi-Head Reinforcement Learning; Yannis Flet-Berliac (University of Lille / Inria); Philippe Preux (INRIA) [external pdf link].
- 35:30 Emergen...