Model-based reinforcement learning (MBRL) is a variant of the iterative learning framework, reinforcement learning, that includes a structured component of the system that is solely optimized to model the environment dynamics. Learning a model is broadly motivated from biology, optimal control, and more – it is grounded in natural human intuition of planning before acting. This intuitive grounding, however, results in a more complicated learning process. In this post, we discuss how model-based reinforcement learning is more susceptible to parameter tuning and how AutoML can help in finding very well performing parameter settings and schedules. Below, left is the expected behavior of an agent maximizing velocity on a “Half Cheetah” robotic task, and to the right is what our paper with hyperparameter tuning finds.
Model-based reinforcement learning (MBRL) is an iterative framework for solving tasks in a partially understood environment. There is an agent that repeatedly tries to solve a problem, accumulating state and action data. With that data, the agent creates a structured learning tool – a dynamics model – to reason about the world. With the dynamics model, the agent decides how to act by predicting into the future. With those actions, the agent collects more data, improves said model, and hopefully improves future actions.
Humans are pretty poor at internalizing higher-dimensional relationships. Unfortunately, all ML systems come with hyperparameters that have complex higher-dimensional relationships. Manually searching for configurations or schedules that work well is a tedious and unrewarding task, so let’s let a computer do it for us. Automated Machine Learning (AutoML) is a field dedicated to the study of using machine learning algorithms to tune our machine learning tools. However, there have not been many attempts in using AutoML methods for RL so far, (for more on AutoRL see this blog post) even though, given the success of AutoML
This article is purposely trimmed, please visit the source to read the full article.