Learning and Feedback Mechanisms
Learning technology has a crucial role to play in the future of game AI. Not only does it improve the intelligence of NPCs within the game world, it also increases the productivity of AI developers. Whether learning is applied to offline optimization or online adaptation, there are common principles to adhere to.
As a rule of thumb, it's best to use supervised learning whenever possible, because it provides the best results by tackling the problem directly (for instance, providing patterns for neural networks to recognize). Reinforcement approaches are not as convenient, but should be the second choice: because there is a constant stream of rewards, reinforcement learning has the advantage of providing hints about which actions are beneficial to the system. Finally, evolutionary techniques should be considered as the last option, because behavior is evaluated only once per episode, so the feedback is far sparser.
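The difference in feedback density between the last two options can be made concrete with a toy sketch. The environment, policy, and reward values below are purely illustrative (none of them come from the chapter): a reinforcement learner receives a reward after every action, while an evolutionary learner only ever sees the single fitness value summarizing the whole episode.

```python
def run_episode(policy, observations):
    """Toy episode: return the per-step rewards the policy earns.
    Action 1 is 'good' (+1 reward), action 0 earns nothing.
    All names here are illustrative, not from the chapter."""
    return [1.0 if policy(obs) == 1 else 0.0 for obs in observations]

observations = [0.9, 0.2, 0.7, 0.4, 0.8]
policy = lambda obs: 1 if obs > 0.5 else 0

# Reinforcement learning receives each reward as the episode unfolds,
# so it can credit or blame individual actions immediately.
step_rewards = run_episode(policy, observations)   # [1.0, 0.0, 1.0, 0.0, 1.0]

# An evolutionary technique only sees one fitness value per episode --
# a far sparser signal for the same amount of play.
episode_fitness = sum(step_rewards)                # 3.0
```

Five decisions produce five reward signals for the reinforcement learner but only one for the evolutionary one, which is why the latter typically needs many more episodes to learn the same behavior.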
There are many issues with learning based on feedback, one of which is dealing with realism. It's fairly easy to express functionality in terms of reward and fitness, but it's much tougher to express humanlike behaviors. By the time "realism" has been modeled with feedback, many aspects of the behavior have already been interpreted as equations—so the problem is almost solved without much need for learning.
Even learning functionality can be difficult, regardless of realism. An important, if not essential, concept for dealing with such learning is feedback with levels of magnitude. The designer can provide hints by including simpler concepts in the feedback function, in a way similar to shaping: the more important the concept, the more weight it carries. For example, Table 49.1 shows an example of learning deathmatch behaviors from scratch; enemy presence, shooting, damage, and, at the highest level, kills are rewarded.
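A magnitude-weighted feedback function of this kind can be sketched as follows. The concept names and weight values below are illustrative assumptions on my part (the actual values belong to Table 49.1); the point is only the structure: simpler concepts earn small rewards so they are learned first, while the top-level concept carries enough weight to dominate once it occurs.

```python
def deathmatch_reward(events):
    """Shaped feedback with levels of magnitude.

    Each simpler concept gets a small weight so the learner is nudged
    toward it early on; the most important concept (kills) carries an
    order of magnitude more weight.  Weights and event names here are
    illustrative, not the values from Table 49.1.
    """
    weights = {
        "enemy_visible": 1.0,    # simplest concept: locate opponents
        "shot_fired":    2.0,    # engage rather than wander
        "hit_landed":    5.0,    # aim well enough to connect
        "kill":          100.0,  # highest level: the real objective
    }
    return sum(weights[name] * count for name, count in events.items())

# Feedback for a bot that saw an enemy, fired twice, landed three hits,
# and scored a kill: 1 + 4 + 15 + 100 = 120.
r = deathmatch_reward({"enemy_visible": 1, "shot_fired": 2,
                       "hit_landed": 3, "kill": 1})
```

Because the kill term dominates, a learner that discovers killing will not regress toward merely spotting enemies, yet the smaller terms still provide a gradient long before the first kill ever happens.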