A Multi-Step Minimax Q-learning Algorithm for Two-Player Zero-Sum Markov Games

2024-07-05

Shreyas S R, Antony Vijesh

Abstract

An iterative procedure is proposed for solving two-player zero-sum Markov games. Under suitable assumptions, the iterates are shown to be bounded. Using results from stochastic approximation, the proposed two-step minimax Q-learning algorithm is shown to converge almost surely. More specifically, the algorithm converges to the game-theoretic optimal value with probability one when the model information is not known. Numerical simulations confirm that the proposed algorithm is effective and easy to implement.
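For orientation, the following is a minimal sketch of the standard one-step minimax Q-learning update that the paper builds on. It is not the two-step variant proposed by the authors, whose update rule the abstract does not state. The toy game (`REWARD`, `NEXT_STATE`), the discount `GAMMA`, and the helper names are all hypothetical and chosen only for illustration. The matrix-game value uses the closed form for 2x2 games. A constant step size is used only because this toy game is deterministic.

```python
import random

# Hypothetical toy zero-sum Markov game: 2 states, 2 actions per player.
# REWARD[s][a][b] is the payoff to the maximizing player.
GAMMA = 0.5
REWARD = {0: [[1, -1], [-1, 1]], 1: [[2, 0], [0, 1]]}
NEXT_STATE = {0: 1, 1: 0}  # deterministic transitions, independent of actions


def matrix_game_value_2x2(m):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin >= minimax:
        # A pure-strategy saddle point exists.
        return maximin
    # No saddle point: closed-form value under mixed strategies.
    return (a * d - b * c) / (a + d - b - c)


def minimax_q_learning(steps=5000, alpha=0.1, seed=0):
    """One-step minimax Q-learning with uniform random exploration."""
    rng = random.Random(seed)
    q = {s: [[0.0, 0.0], [0.0, 0.0]] for s in REWARD}
    s = 0
    for _ in range(steps):
        a, b = rng.randrange(2), rng.randrange(2)
        s_next = NEXT_STATE[s]
        # Bootstrap from the game value of the next state's Q-matrix.
        target = REWARD[s][a][b] + GAMMA * matrix_game_value_2x2(q[s_next])
        q[s][a][b] += alpha * (target - q[s][a][b])
        s = s_next
    return {s: matrix_game_value_2x2(q[s]) for s in q}


if __name__ == "__main__":
    print(minimax_q_learning())
```

In this toy game the optimal values solve v0 = 0.5 v1 and v1 = 2/3 + 0.5 v0, giving v0 = 4/9 and v1 = 8/9, which the learned values should approach. For games with more than two actions per player, the matrix-game value would typically be computed with a linear program instead of the closed form.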