The document discusses control as inference in Markov decision processes (MDPs) and partially observable MDPs (POMDPs). It introduces optimality variables that represent whether a state-action pair is optimal or not. It formulates the optimal action-value function Q* and optimal value function V* in terms of these optimality variables and the reward and transition distributions. Q* is defined as the log probability of a state-action pair being optimal, and V* is defined as the log probability of a state being optimal. Bellman equations are derived relating Q* and V* to the reward and next state value.
This document discusses methods for automated machine learning (AutoML) and optimization of hyperparameters. It focuses on accelerating the Nelder-Mead method for hyperparameter optimization using predictive parallel evaluation. Specifically, it proposes using a Gaussian process to model the objective function and perform predictive evaluations in parallel to reduce the number of actual function evaluations needed by the Nelder-Mead method. The results show this approach reduces evaluations by 49-63% compared to baseline methods.
This document discusses methods for automated machine learning (AutoML) and optimization of hyperparameters. It focuses on accelerating the Nelder-Mead method for hyperparameter optimization using predictive parallel evaluation. Specifically, it proposes using a Gaussian process to model the objective function and perform predictive evaluations in parallel to reduce the number of actual function evaluations needed by the Nelder-Mead method. The results show this approach reduces evaluations by 49-63% compared to baseline methods.