Hierarchical Programmatic Reinforcement Learning via Learning to Compose Programs

Liu, Guan Ting; Hu, En Pei; Cheng, Pu-Jen; Lee, Hung-Yi; Sun, Shao Hua

Journal

Proceedings of Machine Learning Research

Journal Volume

202

Date Issued

2023-01-01

Author(s)

Liu, Guan Ting

Hu, En Pei

Cheng, Pu-Jen

Lee, Hung-Yi

Sun, Shao Hua

URI

https://scholars.lib.ntu.edu.tw/handle/123456789/636993

URL

https://api.elsevier.com/content/abstract/scopus_id/85174410787

Abstract

Aiming to produce reinforcement learning (RL) policies that are human-interpretable and generalize better to novel scenarios, Trivedi et al. (2021) present a method (LEAPS) that first learns a program embedding space to continuously parameterize diverse programs from a pre-generated program dataset, and then, given a task, searches the learned embedding space for a task-solving program. Despite the encouraging results, the program policies that LEAPS can produce are limited by the distribution of the program dataset. Furthermore, during search, LEAPS evaluates each candidate program solely on its return, so it cannot precisely reward the correct parts of a program or penalize the incorrect parts. To address these issues, we propose to learn a meta-policy that composes a series of programs sampled from the learned program embedding space. By learning to compose programs, our proposed hierarchical programmatic reinforcement learning (HPRL) framework can produce program policies that describe complex, out-of-distribution behaviors and can directly assign credit to the programs that induce desired behaviors. The experimental results in the Karel domain show that our proposed framework outperforms baselines. The ablation studies confirm the limitations of LEAPS and justify our design choices.
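
As a rough illustration of the composition idea described in the abstract, the toy sketch below has a meta-policy pick one latent value per step, decodes each value into a small program, runs the programs in sequence, and records a separate reward for each program. Everything here is an assumption made for illustration: the names `decode` and `compose_episode`, the scalar latent, and the one-dimensional integer state are hypothetical and do not come from the paper, which works with a learned program embedding space and the Karel domain.

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins for illustration only; not the paper's API.
Program = Callable[[int], int]        # a "program" maps a toy state to a new state
MetaPolicy = Callable[[int], float]   # maps the current state to a latent value


def decode(z: float) -> Program:
    """Toy decoder: turn a latent value into the program 'move by round(z)'."""
    k = round(z)
    return lambda state: state + k


def compose_episode(
    meta_policy: MetaPolicy, goal: int, horizon: int, start: int = 0
) -> Tuple[int, List[Tuple[float, int]]]:
    """Run up to `horizon` decoded programs in sequence.

    Returns the final state and a trace of (latent, reward) pairs, where each
    reward is the progress toward `goal` made by that single program.
    """
    state = start
    trace: List[Tuple[float, int]] = []
    for _ in range(horizon):
        z = meta_policy(state)          # meta-policy chooses a point in latent space
        program = decode(z)             # latent -> executable program
        next_state = program(state)     # execute the program from the current state
        # Credit is assigned per program, not once for the whole composition.
        reward = abs(goal - state) - abs(goal - next_state)
        trace.append((z, reward))
        state = next_state
        if state == goal:
            break
    return state, trace


if __name__ == "__main__":
    # A hand-written meta-policy: move at most 3 steps, never overshoot 10.
    final, trace = compose_episode(lambda s: min(3.0, float(10 - s)), goal=10, horizon=5)
    print(final, trace)
```

In this sketch the per-step rewards show which of the composed programs moved the agent toward the goal, in contrast to scoring one whole program by its total return.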

Type

conference paper
