Nonparametric adaptive control for discrete-time Markov processes with unbounded costs under average criterion
Volume 26 / 1999
Applicationes Mathematicae 26 (1999), 267-280
DOI: 10.4064/am-26-3-267-280
Abstract
We introduce average cost optimal adaptive policies in a class of discrete-time Markov control processes with Borel state and action spaces, allowing unbounded costs. The processes evolve according to the system equations $x_{t+1}=F(x_t,a_t,ξ_t)$, $t=1,2,\dots$, with i.i.d. $ℝ^k$-valued random vectors $ξ_t$, which are observable but whose density $ϱ$ is unknown.
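
To make the setting concrete, the following is a minimal sketch of a system of the form $x_{t+1}=F(x_t,a_t,ξ_t)$ in which the noise is observed and its density is estimated from the observations. Everything specific here is an assumption made for illustration and is not taken from the paper: the scalar case $k=1$, the linear map `F`, the fixed feedback rule `policy`, the standard normal noise, and the Gaussian kernel estimator `kde` with its bandwidth. In particular, `policy` is not the adaptive policy the abstract refers to, and the abstract does not say which density estimator the authors use.

```python
import math
import random


def F(x, a, xi):
    # Hypothetical linear dynamics, chosen only for illustration.
    return 0.5 * x + a + xi


def policy(x):
    # Hypothetical stationary feedback rule (not the paper's adaptive policy).
    return -0.25 * x


def kde(samples, bandwidth):
    """Return a Gaussian kernel density estimate built from observed noise."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def rho_hat(s):
        return norm * sum(
            math.exp(-0.5 * ((s - xi) / bandwidth) ** 2) for xi in samples
        )

    return rho_hat


def simulate(horizon, seed=0):
    """Run the recursion x_{t+1} = F(x_t, a_t, xi_t) and record the noise."""
    rng = random.Random(seed)
    x = 0.0
    states, noises = [x], []
    for _ in range(horizon):
        a = policy(x)
        xi = rng.gauss(0.0, 1.0)  # true density, unknown to the controller
        x = F(x, a, xi)
        states.append(x)
        noises.append(xi)  # xi_t is observable, so it can be stored
    return states, noises


states, noises = simulate(2000)
rho_hat = kde(noises, bandwidth=0.3)  # estimate of the unknown density
print(rho_hat(0.0))
```

Because the $ξ_t$ are observed directly, the estimate `rho_hat` can be refreshed at every step without inverting the dynamics; an adaptive scheme would then use the current estimate in place of the unknown density when choosing actions.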