Making RL tractable by learning more informative reward functions: example-based control, meta-learning, and normalized maximum likelihood

Diagram of MURAL, our method for learning uncertainty-aware rewards for RL. After the user provides a few examples of desired outcomes, MURAL automatically infers a reward function that takes into account these examples and the age...

Read the whole article here.

Get more robotics info at Robohub

Comments