## Team: mathfi

## Section: New Results

**Keywords:** *policy iteration algorithm*, *Howard algorithm*, *dynamic programming*, *nonexpansive maps*.

## Policy iteration algorithms for fixed point problems with nonexpansive operators

**Participants:** J.-Ph. Chancelier, M. Messaoud, A. Sulem.

The case of Bellman equations associated with the optimal control of Markov chains on an infinite horizon with a discount factor $\lambda > 0$ has long been studied by many authors (see e.g. the monographs by Bertsekas and Puterman). Typically these equations are of the form

$$
v = \max_{w \in \mathcal{W}} \left( c^{w} + \frac{1}{1+\lambda}\, M^{w} v \right),
$$

where $M^{w}$ is the transition matrix of the Markov chain, $c^{w}$ is the running utility, and $w$ is the control variable with values in some control set $\mathcal{W}$. The policy iteration algorithm is known to converge to the solution of the Bellman equation since the operator $v \mapsto \max_{w \in \mathcal{W}} \left( c^{w} + (1+\lambda)^{-1} M^{w} v \right)$ is contractive and satisfies a discrete maximum principle.
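As a minimal numerical sketch of this classical setting, the following Python snippet runs Howard's policy iteration on a small discounted Markov chain control problem. All data (the transition matrices `P`, utilities `c`, and discount `lam`) are hypothetical, chosen purely for illustration:

```python
import numpy as np

# Hypothetical 3-state, 2-control example of the discounted Bellman
# equation v = max_w ( c^w + (1+lam)^{-1} M^w v ); data are made up.
lam = 0.25                                  # discount factor lam > 0
P = [np.array([[0.5, 0.5, 0.0],
               [0.1, 0.8, 0.1],
               [0.0, 0.5, 0.5]]),
     np.array([[1.0, 0.0, 0.0],
               [0.2, 0.6, 0.2],
               [0.3, 0.3, 0.4]])]
M = [p / (1.0 + lam) for p in P]            # contractive: norm 1/(1+lam) < 1
c = [np.array([1.0, 0.0, 2.0]),
     np.array([0.5, 1.5, 0.0])]

def policy_iteration(M, c, tol=1e-12, max_iter=100):
    """Howard's policy iteration: alternate policy evaluation
    (a linear solve) and policy improvement (a pointwise argmax)."""
    n = M[0].shape[0]
    w = np.zeros(n, dtype=int)              # initial policy
    for _ in range(max_iter):
        # Evaluation: solve (I - M^w) v = c^w for the frozen policy w.
        Mw = np.array([M[w[i]][i] for i in range(n)])
        cw = np.array([c[w[i]][i] for i in range(n)])
        v = np.linalg.solve(np.eye(n) - Mw, cw)
        # Improvement: pick the maximizing control state by state.
        q = np.stack([Mi @ v + ci for Mi, ci in zip(M, c)])
        w_new = np.argmax(q, axis=0)
        if np.max(q.max(axis=0) - v) < tol: # v solves the Bellman equation
            return v, w_new
        w = w_new
    return v, w

v, w = policy_iteration(M, c)
```

Since the control set is finite and each improvement step strictly increases the value, the iteration terminates after finitely many policies.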

The problem addressed here concerns more general fixed point problems on a finite state space. Typically, the operator we consider is the maximum of a contractive operator and a nonexpansive one satisfying some appropriate properties. Shortest path problems also lead to fixed point problems with nonexpansive operators, but in a rather different context, whereas reflecting boundaries lead to nonexpansive operators on the boundary; this last problem appears to be a special case of ours. We prove the convergence of a policy iteration algorithm and give illustrating examples in optimal control of Markov chains in [31].
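To illustrate the kind of fixed point problem considered here (this is only a plausible sketch of a policy-iteration-style scheme, not the algorithm of [31], whose precise assumptions and convergence proof are in the paper), consider $v = \max(T_1 v, T_2 v)$ componentwise, with $T_1$ contractive and $T_2$ nonexpansive. The "policy" selects, state by state, which operator is active; all data below are hypothetical and chosen so that the evaluation systems are nonsingular:

```python
import numpy as np

# Toy problem v = F(v), F(v) = max(T1 v, T2 v) componentwise, with
#   T1 v = 0.9 * P @ v + c   (contractive: discounted stochastic matrix)
#   T2 v = N @ v + b         (nonexpansive: a single 1 per row, e.g. a
#                             reflection v_0 = v_1 - cost at a boundary).
P = np.array([[0.5, 0.5], [0.5, 0.5]])
c = np.array([0.0, 2.0])
N = np.array([[0.0, 1.0], [0.0, 1.0]])
b = np.array([-1.0, -10.0])
A = [0.9 * P, N]                          # matrices of the two operators
d = [c, b]                                # affine parts

def operator_policy_iteration(A, d, tol=1e-12, max_iter=50):
    """Policy-iteration-style scheme: the 'policy' s picks, state by
    state, which of the two operators is active."""
    n = A[0].shape[0]
    s = np.zeros(n, dtype=int)            # start with the contractive operator
    for _ in range(max_iter):
        As = np.array([A[s[i]][i] for i in range(n)])
        ds = np.array([d[s[i]][i] for i in range(n)])
        # Evaluation: solve (I - A_s) v = d_s (nonsingular for this data).
        v = np.linalg.solve(np.eye(n) - As, ds)
        # Improvement: switch to the operator achieving the max.
        q = np.stack([Ai @ vi for Ai, vi in [(A[0], v), (A[1], v)]]) + np.stack(d)
        s_new = np.argmax(q, axis=0)
        if np.max(q.max(axis=0) - v) < tol:
            return v, s_new
        s = s_new
    return v, s

v, s = operator_policy_iteration(A, d)    # here v = [14.5, 15.5]
```

For this data the fixed point uses the nonexpansive (reflecting) operator at state 0 and the contractive one at state 1; the general case, where the evaluation systems need not be so well behaved, is exactly what the convergence analysis of [31] addresses.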

These results can be applied to the numerical analysis of quasi-variational inequalities (QVIs) associated with combined impulse/stochastic optimal control.