Chapter 46. Learning Reinforcement
The essence of playing games is to take the best action given the current situation. A policy expresses what should be done in each situation. Although the environment is usually nondeterministic by nature, most AI algorithms assume that underlying trends can be learned. Reinforcement learning (RL) is a class of problems in which the policy must be adjusted by trial and error, based on feedback from the environment.
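Concretely, a policy can be as simple as a mapping from situations to actions. The following sketch uses hypothetical states and actions for a game bot; the names are illustrative, not taken from any specific system.

```python
# A policy maps each situation (state) to an action.
# These states and actions are hypothetical examples for a game bot.
policy = {
    "enemy_visible": "shoot",
    "low_health": "retreat",
    "no_target": "explore",
}

def act(state):
    """Return the action the policy prescribes for the given state."""
    return policy[state]

print(act("low_health"))  # retreat
```

Reinforcement learning is then the problem of adjusting such a mapping until it prescribes good actions, using only reward feedback from the environment.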
Numerous problems can be modeled using reinforcement signals from the environment—for example, learning to aim, move, or even play deathmatch. Adaptation allows the animats to act in a more intelligent fashion without needing scripts or other designer assistance.
General optimization strategies (such as genetic algorithms) can be used to solve reinforcement problems, but they do not exploit the structure of the problem. Specialized RL algorithms instead use the reward signal itself to learn the best policy in a more efficient fashion.
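As a minimal sketch of learning from the reward signal itself, the following epsilon-greedy bandit learner keeps a running estimate of each action's value and updates only the estimate of the action actually tried. The environment and its reward values are invented for illustration.

```python
import random

def bandit_learn(reward_fn, n_actions, episodes=5000, epsilon=0.1, seed=0):
    """Learn action-value estimates by trial and error from rewards."""
    rng = random.Random(seed)
    values = [0.0] * n_actions   # estimated average reward per action
    counts = [0] * n_actions     # times each action was tried
    for _ in range(episodes):
        # Explore occasionally; otherwise exploit the best current estimate.
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda i: values[i])
        r = reward_fn(a, rng)
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]  # incremental mean update

    return values

# Hypothetical environment: action 2 yields the highest expected reward.
def noisy_reward(action, rng):
    means = [0.2, 0.5, 0.8]
    return means[action] + rng.gauss(0.0, 0.1)

values = bandit_learn(noisy_reward, n_actions=3)
best = max(range(3), key=lambda i: values[i])
```

Note how the update touches only the chosen action's estimate, whereas a genetic algorithm would evaluate whole candidate policies and discard most of that feedback; this per-action credit assignment is what makes specialized RL methods more sample-efficient.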
This chapter covers the following topics:
In practice, RL can be used in games to learn the best tactics based on trial and error. Intuitively, this can be understood as a learning finite-state machine, as covered in the next chapter.