First, be sure to read Section 4.3 which presents results for a more complex, "woods-like" environment, and shows that both a complete mapping and optimal (payoff) performance are achieved.

But second, it sounds like you are saying "In my experience, finding and keeping good classifiers is very tough. If you focus on accuracy and not payoff, I'm sure it will be even tougher". I know that problem, but believe it is a consequence of payoff-based fitness and a panmictic GA. Good rules are hard to find because the panmictic GA spends most of its time crossing, fruitlessly, rules from different "niches"; that is, basically unrelated rules. Furthermore, since payoff-based fitness doesn't lead to accurate generalizations, many of the rules produced are "bad" in the sense that while they work in some situations, they are not good in others--that's the curse of overgeneralization.

In contrast, as you have seen, XCS uses accuracy-based fitness and a niche GA. The niche GA, as Booker pointed out in 1982, only crosses rules that are related in the sense of matching the same situation; there are no "freak" crosses. But--you ask--that sounds "inbred". How can that lead to discovery of good rules in some new niche? Answer: all it takes is one rule in a niche to get the process started. That rule can come from covering, if the match set is initially empty there. Or, it can come from generalization via the GA acting in some other niche, resulting in a classifier that matches in the current niche. From that point on, the GA in the current niche will work with the classifiers there until accurate, maximally general rules are evolved for every action choice in the niche. Note that once this has occurred, the "best" rule from the point of view of payoff must also be present.

Actually, one of the nicest things about XCS is that rule discovery is suddenly no longer an appreciable problem. Your question is a very natural one, especially from people with LCS experience. However, I can tell you that with XCS--at least so far--the mapping just fills right in, and quite rapidly.

What's the likelihood of discovering and maintaining such a complete mapping in problems more complex than the boolean multiplexer? In real problems, discovering good rules (where "good" means able to get a large payoff) is difficult. Discovering enough meaningful rules, both good and bad, to form a complete mapping seems quite unlikely, if you only focus on the problem of maintaining the most accurate rules.