CIRJE-F-7. Matsushima, Hitoshi, "Learning about Stochastic Payoff Structures", June 1998.
We consider a decision maker who faces decision-making problems infinitely many times. She does not know the state-dependent stochastic payoffs and learns from past experience according to an adaptive learning rule. She maximizes her subjective expected payoff and never experiments with actions. We show that, in the long run, the decision maker comes to choose only one action, irrespective of which states she anticipates are likely to occur. This result holds even when she can monitor the true state almost perfectly. We give a characterization and argue that the action chosen in the long run may be objectively maximin.