[Figure panel titles: "Optimization and exploratory learning have a non-negative relation" / "Convergence of learning towards the ..."]
- We prove that the gradients of optimization and exploratory learning satisfy a non-negative relation.
- We prove that any fixpoint of exploratory learning for any data must be an actual inverse model.
- We prove that exploratory learning with goal babbling not only works but also converges to the optimal least-squares solution (see the sketch after this list).
- We show that the basic learning dynamics of goal babbling resemble those of explosive combustion processes. This sheds new light on the previous finding that goal babbling constitutes a positive feedback loop, and it explains the S-shaped learning curves that are also observed in human learning.
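For intuition on the convergence claim, the following is a minimal sketch of goal babbling in a linear, redundant domain. The concrete choices (the forward map x = Aq, the linear inverse estimate q = Wx, the online least-squares (LMS) update on self-generated outcome-action pairs, and the noise and learning-rate values) are illustrative assumptions rather than the construction used in the paper; under these assumptions, W should empirically approach the Moore-Penrose pseudoinverse, i.e. the least-squares inverse.

```python
# Illustrative goal-babbling loop in a linear, redundant domain.
# Assumed setup (not taken from the paper): forward map x = A q with more
# action than outcome dimensions, linear inverse estimate q = W x, isotropic
# exploration noise, and an online least-squares (LMS) regression update.
import numpy as np

rng = np.random.default_rng(0)

m, n = 2, 5                           # outcome dim < action dim -> redundancy
A = rng.normal(size=(m, n))           # linear forward function   x = A q
W = np.zeros((n, m))                  # inverse estimate           q = W x
eta, sigma = 0.05, 0.1                # learning rate, exploration noise

for t in range(20000):
    x_goal = rng.normal(size=m)                    # sample a goal
    q = W @ x_goal + sigma * rng.normal(size=n)    # act, with exploratory noise
    x = A @ q                                      # observe the actual outcome
    W += eta * np.outer(q - W @ x, x)              # regress action onto outcome

print("inverse-model error ||A W - I||:", np.linalg.norm(A @ W - np.eye(m)))
print("distance to pseudoinverse ||W - A+||:", np.linalg.norm(W - np.linalg.pinv(A)))
```

With a sufficiently small learning rate, the residual ||AW - I|| should shrink toward the noise floor, consistent with the fixpoint statement in the second highlight.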
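To see why a positive feedback loop produces S-shaped curves, it helps to recall the simplest autocatalytic growth model; the logistic form below is only an illustrative analogy, not the learning dynamics derived in the paper.

```latex
% Illustrative only: logistic (autocatalytic) growth as the simplest model of a
% positive feedback loop; c(t) can be read as the learner's competence.
\begin{align}
  \dot c &= \alpha \, c \,(1 - c), \qquad 0 < c(0) < 1, \\
  c(t)   &= \frac{1}{1 + \bigl(\tfrac{1}{c(0)} - 1\bigr)\, e^{-\alpha t}}
\end{align}
```

Progress is slow while competence is low, accelerates as competence feeds back into better exploration, and saturates near the solution, which yields the S-shape.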
Abstract — We investigate the role of redundancy in the exploratory learning of inverse functions, where an agent learns to achieve goals by performing actions and observing outcomes. We present an analysis of the linear redundant case and investigate goal-directed exploration approaches, which are empirically successful but have hardly been analyzed theoretically apart from negative results for special cases, and we prove their convergence to the optimal least-squares solution. We further show that the learning curves of such processes are intrinsically low-dimensional and S-shaped, which explains previous empirical findings, and we finally relate our results to non-linear domains.