Inverse RL (IRL)
Learning a reward function from expert demonstrations, under the assumption that the expert acts (approximately) optimally with respect to that reward. IRL avoids hand-engineering reward functions: the reward is inferred from behavior and can then be used to train a policy via standard RL. Maximum entropy IRL and adversarial IRL (AIRL) are popular formulations, and IRL is closely related to GAIL (generative adversarial imitation learning), which skips explicit reward recovery and imitates the expert directly.
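As a concrete illustration, here is a minimal sketch of maximum entropy IRL on a toy 5-state chain MDP (the MDP, the single expert demonstration, and all hyperparameters are assumptions for illustration, not from the source). The reward is linear in one-hot state features; each iteration runs a backward soft value iteration to get the maximum-entropy policy under the current reward, a forward pass to get its expected state-visitation counts, and a gradient step that matches the expert's feature counts:

```python
import numpy as np

n_states, n_actions = 5, 2
T = 6  # trajectory length, in states

# Hypothetical deterministic chain MDP: action 0 steps left, action 1 steps right (clipped).
P = np.array([[max(s - 1, 0), min(s + 1, n_states - 1)] for s in range(n_states)])

# One assumed expert demo: walk right from state 0, then stay at the rightmost state.
expert_states = [0, 1, 2, 3, 4, 4]
mu_expert = np.bincount(expert_states, minlength=n_states).astype(float)

w = np.zeros(n_states)  # linear reward on one-hot state features: r(s) = w[s]

for _ in range(200):
    # Backward pass: finite-horizon soft value iteration under the current reward.
    V = w.copy()                      # value at the final time step
    policies = []
    for _ in range(T - 1):
        Q = w[:, None] + V[P]         # Q[s, a] = r(s) + V(next state)
        V = np.logaddexp(Q[:, 0], Q[:, 1])        # soft max over actions
        policies.append(np.exp(Q - V[:, None]))   # soft policy pi_t(a|s)
    policies.reverse()                # reorder to time order t = 0 .. T-2

    # Forward pass: expected state-visitation counts of the soft policy.
    d = np.zeros(n_states)
    d[0] = 1.0                        # start-state distribution
    mu = d.copy()
    for pi in policies:
        dn = np.zeros(n_states)
        for a in range(n_actions):
            np.add.at(dn, P[:, a], d * pi[:, a])  # push mass along transitions
        d = dn
        mu += d

    # Gradient ascent on the MaxEnt IRL log-likelihood: match feature counts.
    w += 0.1 * (mu_expert - mu)

print(np.argmax(w))  # state with the highest inferred reward
```

After training, the inferred reward is highest at the state the expert seeks out, and could then be handed to any standard RL algorithm. AIRL and GAIL replace this tabular feature-matching loop with a learned discriminator, which is what lets them scale to continuous, high-dimensional tasks.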