Subject: Q1.4: What's a Classifier System (CFS)?
The name of the Game
First, a word on naming conventions is due, for no other paradigm of
EC has undergone more changes to its name space than this one.
Initially, Holland called his cognitive models "Classifier Systems"
abbrv. with CS, and sometimes CFS, as can be found in [GOLD89].
Whence Riolo came into play in 1986 and Holland added a reinforcement
component to the overall design of a CFS, that emphasized its ability
to learn. So, the word "learning" was prepended to the name, to make:
"Learning Classifier Systems" (abbrv. to LCS). On October 6-9, 1992
the "1st Inter'l Workshop on Learning Classifier Systems" took place
at the NASA Johnson Space Center, Houston, TX. A summary of this
"summit" of all leading researchers in LCS can be found on ENCORE
(See Q15.3) in file CFS/papers/lcs92.ps.gz
Today, the story continues, LCSs are sometimes subsumed under a "new"
machine learning paradigm called "Evolutionary Reinforcement
Learning" or ERL for short, incorporating LCSs, "Q-Learning", devised
by Watkins (1989), and a paradigm of the same name, devised by Ackley
and Littman [ALIFEIII].
And then, this latter statement is really somewhat confusing, as
Marco Dorigo has pointed out in a letter to editors of this guide,
since Q-Learning has no evolutionary component. So please let the
Senior Editor explain: When I wrote this part of the guide, I just
had in mind that Q-Learning would make for a pretty good replacement
of Holland's bucket-brigade algorithm, so I used this litte
speculation to see what comes out of it; in early December 95, almost
two years later, it has finally caught Marco's attention. But
meanwhile, I have been proven right: Wilson has suggested to use Q-
Learning in CLASSIFIER SYSTEM (Wilson 1994) and Dorigo & Bersini
(1994) have shown that Q-Learning and the bucket-brigade are truly
equivalent concepts.
We would therefore be allowed to call a CFS that uses Q-Learning for
rule discovery, rather than a bucket-brigade, a Q-CFS, Q-LCS, or Q-
CS; in any case would the result be subsumed under the term ERL, even
if Q-Learning itself is not an EVOLUTIONARY ALGORITHM!
Interestingly, Wilson called his system ZCS (apparantly no "Q"
inside), while Dorigo & Bersini called their system a D-Max-VSCS, or
"discounted max very simple classifier system" (and if you know Q-
Learning at least the D-Max part of the name will remind you of that
algorithm). The latter paper can be found on Encore (see Q15.3) in
file CFS/papers/sab94.ps.gz
And by the way in [HOLLAND95] the term "classifier system" is not
used anymore. You cannot find it in the index. Its a gone! Holland
now stresses the adaptive component of his invention, and simply
calls the resulting systems adaptive agents. These agents are then
implemented within the framework of his recent invention called ECHO.
(See http://www.santafe.edu/projects/echo for more.)
Alwyn Barry's LCS Web has a great deal of information on LCS. See:
http://www.csm.uwe.ac.uk/~ambarry/LCSWEB/
On Schema Processors and ANIMATS
So, to get back to the question above, "What are CFSs?", we first
might answer, "Well, there are many interpretations of Holland's
ideas...what do you like to know in particular?" And then we'd start
with a recitation from [HOLLAND75], [HOLLAND92], and explain all the
SCHEMA processors, the broadcast language, etc. But, we will take a
more comprehensive, and intuitive way to understand what CLASSIFIER
SYSTEMs are all about.
The easiest road to explore the very nature of classifier systems, is
to take the animat (ANIMAl + ROBOT = ANIMAT) "lane" devised by Booker
(1982) and later studied extensively by Wilson (1985), who also
coined the term for this approach. Work continues on animats but is
often regarded as ARTIFICIAL LIFE rather than EVOLUTIONARY
COMPUTATION. This thread of research has even its own conference
series: "Simulation of Adaptive Behavior (SAB)" (cf Q12), including
other notions from machine learning, connectionist learning,
evolutionary robotics, etc. [NB: the latter is obvious, if an animat
lives in a digital microcosm, it can be put into the real world by
implantation into an autonomous robot vehicle, that has
sensors/detectors (camera eyes, whiskers, etc.) and effectors
(wheels, robot arms, etc.); so all that's needed is to use our
algorithm as the "brain" of this vehicle, connecting the hardware
parts with the software learning component. For a fascinating intro
to the field see, e.g. Braitenberg (1984)]
classifier systems, however, are yet another offspring of John
Holland's aforementioned book, and can be seen as one of the early
applications of GAs, for CFSs use this EVOLUTIONARY ALGORITHM to
adapt their behavior toward a changing ENVIRONMENT, as is explained
below in greater detail.
Holland envisioned a cognitive system capable of classifying the
goings on in its environment, and then reacting to these goings on
appropriately. So what is needed to build such a system? Obviously,
we need (1) an environment; (2) receptors that tell our system about
the goings on; (3) effectors, that let our system manipulate its
environment; and (4) the system itself, conveniently a "black box" in
this first approach, that has (2) and (3) attached to it, and "lives"
in (1).
In the animat approach, (1) usually is an artificially created
digital world, e.g. in Booker's Gofer system, a 2 dimensional grid
that contains "food" and "poison". And the Gofer itself, that walks
across this grid and tries (a) to learn to distinguish between these
two items, and (b) survive well fed.
Much the same for Wilson's animat, called "*". Yes, its just an
asterisk, and a "Kafka-esque naming policy" is one of the sign posts
of the whole field; e.g. the first implementation by Holland and
Reitmann 1978 was called CS-1, (cognitive system 1); Smith's Poker
player LS-1 (1980) followed this "convention". Riolo's 1988 LCS
implementations on top of his CFS-C library (cf Q20), were dubbed
FSW-1 (Finite State World 1), and LETSEQ-1 (LETter SEQuence predictor
1).
So from the latter paragraph we can conclude that environment can
also mean something completely different (e.g. an infinite stream of
letters, time serieses, etc.) than in the animat approach, but
anyway; we'll stick to it, and go on.
Imagine a very simple animat, e.g. a simplified model of a frog.
Now, we know that frogs live in (a) Muppet Shows, or (b) little
ponds; so we chose the latter as our demo environment (1); and the
former for a non-Kafka-esque name of our model, so let's dub it
"Kermit".
Kermit has eyes, i.e. sensorial input detectors (2); hands and legs,
i.e. environment-manipulating effectors (3); is a spicy-fly-
detecting-and-eating device, i.e. a frog (4); so we got all the 4
pieces needed.
Inside the Black Box
The most primitive "black box" we can think of is a computer. It has
inputs (2), and outputs (3), and a message passing system inbetween,
that converts (i.e., computes), certain input messages into output
messages, according to a set of rules, usually called the "program"
of that computer. From the theory of computer science, we now borrow
the simplest of all program structures, that is something called
"production system" or PS for short. A PS has been shown to be
computationally complete by Post (1943), that's why it is sometimes
called a "ost System", and later by Minsky (1967). Although it
merely consists of a set of if-then rules, it still resembles a full-
fledged computer.
We now term a single "if-then" rule a "classifier", and choose a
representation that makes it easy to manipulate these, for example by
encoding them into binary strings. We then term the set of
classifiers, a "classifier population", and immediately know how to
breed new rules for our system: just use a GA to generate new
rules/classifiers from the current POPULATION!
All that's left are the messages floating through the black box.
They should also be simple strings of zeroes and ones, and are to be
kept in a data structure, we call "the message list".
With all this given, we can imagine the goings on inside the black
box as follows: The input interface (2) generates messages, i.e., 0/1
strings, that are written on the message list. Then these messages
are matched against the condition-part of all classifiers, to find
out which actions are to be triggered. The message list is then
emptied, and the encoded actions, themselves just messages, are
posted to the message list. Then, the output interface (3) checks
the message list for messages concerning the effectors. And the cycle
restarts.
Note, that it is possible in this set-up to have "internal messages",
because the message list is not emptied after (3) has checked; thus,
the input interface messages are added to the initially empty list.
(cf Algorithm CFS, LCS below)
The general idea of the CFS is to start from scratch, i.e., from
tabula rasa (without any knowledge) using a randomly generated
classifier population, and let the system learn its program by
induction, (cf Holland et al. 1986), this reduces the input stream to
recurrent input patterns, that must be repeated over and over again,
to enable the animat to classify its current situation/context and
react on the goings on appropriately.
What does it need to be a frog?
Let's take a look at the behavior emitted by Kermit. It lives in its
digital microwilderness where it moves around randomly. [NB: This
seemingly "random" behavior is not that random at all; for more on
instinctive, i.e., innate behavior of non-artificial animals see,
e.g. Tinbergen (1951)]
Whenever a small flying object appears, that has no stripes, Kermit
should eat it, because its very likely a spicy fly, or other flying
insect. If it has stripes, the insect is better left alone, because
Kermit had better not bother with wasps, hornets, or bees. If Kermit
encounters a large, looming object, it immediately uses its effectors
to jump away, as far as possible.
So, part of these behavior patterns within the "pond world", in AI
sometimes called a "frame," from traditional knowledge representation
techniques, Rich (1983), can be expressed in a set of "if <condition>
then <action>" rules, as follows:
IF small, flying object to the left THEN send @
IF small, flying object to the right THEN send %
IF small, flying object centered THEN send $
IF large, looming object THEN send !
IF no large, looming object THEN send *
IF * and @ THEN move head 15 degrees left
IF * and % THEN move head 15 degrees right
IF * and $ THEN move in direction head pointing
IF ! THEN move rapidly away from direction head pointing
Now, this set of rules has to be encoded for use within a CLASSIFIER
SYSTEM. A possible encoding of the above rule set in CFS-C (Riolo)
classifier terminology. The condition part consists of two
conditions, that are combined with a logical AND, thus must be met
both to trigger the associated action. This structure needs a NOT
operation, (so we get NAND, and know from hardware design, that we
can build any computer solely with NANDs), in CFS-C this is denoted
by the `~' prefix character (rule #5).
IF THEN
0000, 00 00 00 00
0000, 00 01 00 01
0000, 00 10 00 10
1111, 01 ## 11 11
~1111, 01 ## 10 00
1000, 00 00 01 00
1000, 00 01 01 01
1000, 00 10 01 10
1111, ## ## 01 11
Obviously, string `0000' denotes small, and `00' in the fist part of
the second column, denotes flying. The last two bits of column #2
encode the direction of the object approaching, where `00' means
left, `01' means right, etc.
In rule #4 a the "don't care symbol" `#' is used, that matches `1'
and `0', i.e., the position of the large, looming object, is
completely arbitrary. A simple fact, that can save Kermit's
(artificial) life.
|