Offline-to-Online RL
A training paradigm that initializes a policy offline from a static dataset, then fine-tunes it online with additional environment interaction. The offline phase provides a strong initialization that avoids unsafe or inefficient early exploration; the online phase then improves performance beyond the ceiling of the offline data. Algorithms such as IQL and CQL are common choices for the offline phase in robot manipulation.
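The two-phase structure can be sketched with tabular Q-learning on a toy chain environment: pretrain on a static dataset of logged transitions, then fine-tune with epsilon-greedy online interaction. This is an illustrative sketch of the paradigm, not IQL or CQL; the environment, hyperparameters, and helper names are all illustrative assumptions.

```python
import random

class ChainEnv:
    """Toy 5-state chain: action 1 moves right, action 0 moves left.
    Reaching the rightmost state gives reward 1 and ends the episode."""
    N = 5
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):
        self.s = min(self.N - 1, self.s + 1) if a == 1 else max(0, self.s - 1)
        done = self.s == self.N - 1
        return self.s, (1.0 if done else 0.0), done

def q_update(Q, s, a, r, s2, done, alpha=0.5, gamma=0.9):
    # Standard one-step Q-learning backup, shared by both phases.
    target = r + (0.0 if done else gamma * max(Q[s2]))
    Q[s][a] += alpha * (target - Q[s][a])

def collect_offline(env, episodes, rng):
    # Static dataset logged by a random behavior policy (stands in for
    # pre-collected robot data).
    data = []
    for _ in range(episodes):
        s = env.reset()
        for _ in range(20):
            a = rng.randrange(2)
            s2, r, done = env.step(a)
            data.append((s, a, r, s2, done))
            s = s2
            if done:
                break
    return data

def offline_phase(Q, data, epochs=50):
    # Phase 1: learn from the fixed dataset only, no new interaction.
    for _ in range(epochs):
        for (s, a, r, s2, done) in data:
            q_update(Q, s, a, r, s2, done)

def online_phase(Q, env, episodes, rng, eps=0.2):
    # Phase 2: fine-tune the pretrained Q-table with fresh interaction.
    for _ in range(episodes):
        s = env.reset()
        for _ in range(20):
            greedy = max(range(2), key=lambda x: Q[s][x])
            a = rng.randrange(2) if rng.random() < eps else greedy
            s2, r, done = env.step(a)
            q_update(Q, s, a, r, s2, done)
            s = s2
            if done:
                break

rng = random.Random(0)
env = ChainEnv()
Q = [[0.0, 0.0] for _ in range(ChainEnv.N)]
offline_phase(Q, collect_offline(env, episodes=10, rng=rng))
online_phase(Q, env, episodes=50, rng=rng)
policy = [max(range(2), key=lambda a: Q[s][a]) for s in range(ChainEnv.N)]
```

In a real robot-learning setup the Q-table would be a neural critic and the offline phase would use a conservatism mechanism (as in IQL or CQL) to avoid overestimating actions unseen in the dataset; the sketch only captures the pretrain-then-fine-tune control flow.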
Robot Learning · RL