However, as the classifier length goes up, the probability of cloning would appear to fall, as you note. I think there are at least two assumptions that have to be examined here:
(1) Will classifier length go up that much in "real" problems? Suppose you have a robot. Will designers cope with the potential detail of "reality" by indefinitely adding bits to the visual input? Or will they instead use more limited, or heterogeneous, resolution plus some form of active vision in which the robot's view changes in order to gather detail? I am suggesting that the ultimate need for huge classifier lengths is not proven.
(2) It is also not clear that, even if huge lengths are used, the mutation rate should stay the same. Perhaps it should fall, roughly in inverse proportion to the length. It is sometimes said that Nature arranges for about one mutation per genotype; with a per-bit rate of 1/L, the chance that an offspring escapes mutation entirely, (1 - 1/L)^L, approaches 1/e ≈ 0.37 regardless of L. If that is a good rule for classifier systems, then the probability of cloning will be roughly independent of length.
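The contrast can be checked numerically. This is my own illustration of the argument, not code from the post: ignoring crossover, an offspring clones its parent exactly when no bit mutates, which happens with probability (1 - mu)**L. The fixed rate of 0.05 below is an arbitrary assumed value.

```python
# Cloning probability under a fixed per-bit mutation rate versus a rate
# scaled as 1/L ("one mutation per genotype"). Assumption: no crossover,
# independent per-bit mutation.
import math

fixed_mu = 0.05                          # assumed constant per-bit rate

for L in (20, 100, 1000):
    p_fixed = (1 - fixed_mu) ** L        # falls toward 0 as L grows
    p_scaled = (1 - 1.0 / L) ** L        # stays near 1/e for all L
    print(L, round(p_fixed, 6), round(p_scaled, 6))

print(round(math.exp(-1), 6))            # the 1/e limit, ~0.367879
```

With the fixed rate, the cloning probability collapses by many orders of magnitude between L = 20 and L = 1000; with the scaled rate it hovers near 0.37 throughout.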
I like macroclassifiers because, at least in the present problems, they are effective (i.e., the numerosities of high-fitness macros indeed grow quite dramatically). Besides making matching more efficient, the macros allow the researcher to see much more rapidly what is really going on in the population in terms of "winning" classifiers--they are the ones with high numerosities. Finally, the size M of the population, considered as a list of unique macroclassifiers, is a nice measure of the degree of generalization that has occurred, and of the space complexity of the system's model.
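The bookkeeping involved is small. As a minimal sketch (my own hypothetical representation, not any particular published implementation): pool identical (condition, action) rules into one entry carrying a numerosity count, and M is then just the number of unique entries.

```python
# Collapse a micro-level rule list into macroclassifiers.
# Assumption: a rule is a (condition, action) pair over the usual
# ternary alphabet {0, 1, #}; identical pairs share one macro entry.
from collections import Counter

def to_macros(rules):
    """Map each unique rule to its numerosity."""
    return Counter(rules)

micro = [("01#1", 0), ("01#1", 0), ("01#1", 0), ("1###", 1), ("1###", 1)]
macros = to_macros(micro)

M = len(macros)                   # unique macroclassifiers: 2
for rule, numerosity in macros.items():
    print(rule, numerosity)       # ("01#1", 0) has numerosity 3, etc.
```

Matching then touches each unique condition once instead of once per copy, which is where the efficiency gain comes from.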
The macroclassifier mechanism seems quite odd in non-toy problems. Even with a genotype length as short as 20, the probability of generating two identical rules is practically zero. So what good is it?