Robust Markov Decision Process
Speaker: Mai Anh Tiến (Singapore Management University)

Time: 9:30, Thursday, August 5, 2021

Venue: Room 302, Building A5, Institute of Mathematics



Abstract: Markov decision processes (MDPs) are widely used in planning, reinforcement learning, and imitation learning. Because optimal policies in an MDP are sensitive to the state transition probabilities, and estimates of these probabilities may be inaccurate, an MDP framework that can handle such uncertainty is both relevant and important. In this talk, I will present a robust version of the MDP model in which optimal policies are required to be robust to ambiguity in the underlying transition probabilities. I will show that essential properties of the non-robust MDP model also hold in this setting, which makes the robust MDP problem tractable. I will also show how the framework and results can be integrated into different algorithmic schemes, including value iteration and (modified) policy iteration, leading to new robust reinforcement learning and imitation learning algorithms that handle uncertainty. Analyses of computational complexity and error propagation under conventional uncertainty settings are also provided.
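
For readers unfamiliar with the idea, the value-iteration scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the speaker's formulation: it assumes a finite uncertainty set of candidate transition kernels and (s,a)-rectangularity (the adversary picks the worst kernel independently for each state-action pair). The function name, the example numbers, and the discount factor are all hypothetical.

```python
import numpy as np


def robust_value_iteration(P_set, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Robust value iteration over a finite set of transition kernels.

    P_set: array of shape (K, S, A, S), K candidate kernels (assumed finite set).
    R:     array of shape (S, A), immediate rewards.
    Returns the robust value function (S,) and a greedy robust policy (S,).
    """
    K, S, A, _ = P_set.shape
    V = np.zeros(S)
    Q = np.zeros((S, A))
    for _ in range(max_iter):
        # Expected next-state value under every candidate kernel: shape (K, S, A).
        expected_next = P_set @ V
        # Worst case over the K kernels, taken separately for each (s, a).
        Q = R + gamma * expected_next.min(axis=0)
        V_new = Q.max(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    return V, Q.argmax(axis=1)


# Hypothetical two-state, two-action example.
P_nominal = np.array([[[0.9, 0.1], [0.2, 0.8]],
                      [[0.5, 0.5], [0.1, 0.9]]])
P_perturbed = np.array([[[0.7, 0.3], [0.4, 0.6]],
                        [[0.6, 0.4], [0.3, 0.7]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])

V_nominal, _ = robust_value_iteration(P_nominal[None], R)
V_robust, policy = robust_value_iteration(np.stack([P_nominal, P_perturbed]), R)
```

With a single kernel in the set, the routine reduces to ordinary value iteration. Adding more candidate kernels can only lower the resulting values, which is the price paid for robustness.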
