Representation of Classifiers
Head and Body
Typically, the condition of the classifiers is modeled as a ternary string (that is, each symbol can have three values). The ternary representation includes the standard binary values of 0 and 1, as well as a "don't care" symbol denoted #—allowing the system to generalize. The length of this ternary condition string is the same as the length of the binary input string (usually perceptions of the animat). By using the "don't care" symbol #, it's possible to match the inputs with the conditions of classifiers. A match occurs when corresponding bits are equal or one of the symbols is #.
As for the classifiers' actions, they are generally encoded as binary strings, corresponding to the effectors of the animat. Classifiers have specific actions, so no abstraction is needed on the outputs (that is, without # symbols).
Each classifier also stores a prediction, which corresponds to the expected return if that action is taken. Return can be defined as the accumulation of the reward over time (for instance, armor units successfully collected in the game). Essentially, return measures how good this particular classifier is in the long term.
Together with prediction, we also store its estimated accuracy. It's important to measure the accuracy of a prediction to emphasize the consistency of a classifier. If the classifier has low accuracy, it's just no good at predictions (regardless of the return). In this case, how can we trust it as a policy? On the other hand, rules with high accuracy demonstrate understanding of the world, and therefore are more likely to succeed when applied.
Before Wilson's XCS system [Wilson94] (a groundbreaking classifier system), the fitness of a classifier was based on its prediction of the return. Essentially, the system rewards how well a classifier thinks it does. This is often inconvenient because very general classifiers tend to have high return predictions, but are particularly inaccurate. All encompassing rules are therefore encouraged, implying the classifier system does not find the best representation, but rather a blur of generic concepts. Instead, using the accuracy as the fitness allows the best set of classifiers to be found. This is, in fact, a key insight into modern LCSs.
To compute the return and the prediction error, a counter is used for each rule. When the counter is 0, the values are not initialized. As the counter increases, so does the accuracy of the values. (More learning examples are available.)