Inverse RL

Inverse reinforcement learning (IRL) is the problem of learning a reward function from expert demonstrations, under the assumption that the expert is (approximately) optimal with respect to that reward. IRL avoids the need to hand-engineer reward functions: the reward is inferred from behavior and then used to train a policy via standard RL. Maximum entropy (MaxEnt) IRL and adversarial IRL (AIRL) are popular formulations. IRL is closely related to GAIL, which bypasses explicit reward recovery and matches the expert's behavior directly.
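As a minimal sketch of the MaxEnt IRL idea, consider a hypothetical toy problem: a single-state MDP with three actions and a reward that is linear in hand-chosen features, r(a) = θ·φ(a). Under the max-entropy model the expert picks actions with probability proportional to exp(r(a)), and the gradient of the demonstration log-likelihood is the difference between expert and model feature expectations. All names and numbers below are illustrative, not from any particular library.

```python
import numpy as np

# One feature vector per action (hypothetical features).
phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 1.0]])

true_theta = np.array([1.0, -1.0])    # hidden "expert" reward weights
rng = np.random.default_rng(0)

def action_probs(theta):
    # MaxEnt model: P(a) proportional to exp(theta . phi(a)).
    logits = phi @ theta
    z = np.exp(logits - logits.max())
    return z / z.sum()

# Sample expert demonstrations from the MaxEnt expert model.
demos = rng.choice(len(phi), size=2000, p=action_probs(true_theta))
expert_features = phi[demos].mean(axis=0)   # empirical feature expectations

# Gradient ascent on the demo log-likelihood. The MaxEnt IRL gradient is
# the feature-matching term: E_expert[phi] - E_model[phi].
theta = np.zeros(2)
for _ in range(500):
    model_features = action_probs(theta) @ phi
    theta += 0.5 * (expert_features - model_features)

# The learned reward should rank actions the same way the expert's does.
print(np.argmax(phi @ theta) == np.argmax(phi @ true_theta))  # → True
```

In realistic multi-state MDPs the model feature expectations require computing state-visitation frequencies under the current soft-optimal policy (e.g. via soft value iteration), but the gradient has the same feature-matching form.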

Related terms: Robot Learning, RL
