A book by
William H. Calvin
UNIVERSITY OF WASHINGTON
SEATTLE, WASHINGTON 98195-1800 USA
THE CEREBRAL CODE
Thinking a Thought in the Mosaics of the Mind
Available from MIT Press and amazon.com.
copyright ©1996 by William H. Calvin
The Brownian Notion
Starting the second act with Kant's triangles is my way of reminding myself of the importance of how you pose the questions and how answers often reformulate a question rather than answering it. We reposition the foundations beneath our feet as we grope for a firmer footing. This is particularly evident when we deal with abstractions, when we move beyond the representations of our sensory worlds and of our movements and operate in the realm of meta-representations, such as categories or analogies.
But there are some problems with this. When I was first exposed to the problem of generalizing about specific examples of triangles, I was fresh from a course in set theory, and so spring-loaded to find subsets and supersets as I looked for a mechanistic foundation for such mental categories (as you'll see, I might have been better off taking a music appreciation course instead). Things look very different to me today, in large part because of developments in cognitive science involving categorical perception, grammar, schemas, scripts, and metaphors. Darwinian copying competitions, in turn, have provided me with another place to stand, a different footing from which to view all of those various types of categories and an ability to imagine how we might construct them on the fly, even invent new levels of abstraction.
The nature of categories has been discussed at least since the ancient Greeks, but the darwinian process provides a fresh way of looking at them.
Yes, the category is a class, but there's often a prototype, a primitive example that shares a lot of features with other members. Eleanor Rosch talks of a basic level category, such as dog, that's defined by the ease with which children and newcomers acquire the concept. Above this is a superordinate level with more abstract classes, such as mammal and pet. Below the basic level is a subordinate level, subclasses such as German Shepherd.
Most unit memories are probably the fuzzier categories and their associations, not our societal and set theory units. As Bickerton observes: Without categories, there can be nothing to attach symbols to, since linguistic symbols, as has been apparent at least since Saussure, do not relate directly to objects in the world, but rather to our concepts of the generalized classes to which raw objects belong. Without associations between stimuli (rather than merely between stimulus and response), there would be no way in which symbols could be attached reliably to concepts. To use words in a referential manner, you've got to recognize them as more than mere labels for objects. They have to be treated as abstract units in a hierarchical network of meanings. And meaning, to the followers of Jean Piaget, is inseparable from experience: meanings are constructed.
Associations between unit memories are, moreover, a test of representation schemes, such as my spatiotemporal firing pattern within a neocortical hexagon, as associations, too, must produce representations under some conditions. An association between various representations, as in the various connotations of a word such as comb, might be expressed as physical contiguity or overlap: a cluster of representations, in the manner of photographs grouped together on a bulletin board.
But physical contiguity might not be needed. To what extent might a virtual construction suffice? Mere linkages, as in a distributed database, parts of which are maintained in diverse locations and assembled only when needed? What would bind them together, in the manner of those colored strings on school bulletin boards that lead to the portraits of the members of the committee?
I suspect that the categories we often use, like those that birds and monkeys can learn, are simpler than the ones that fill the remaining chapters. Feature detection is often fuzzy, with a wide range of acceptable shapes still yielding the same perception, which is, of course, serving the function of categorization. But some of these crude mechanisms may not be very extensible, as they say in the software business. Simple foundations may suffice for limited purposes, but other foundations may prove better when you go to add on a second story (in archaeology, a ruin's wall thickness is considered a clue to its original height).
Relationships, too, are abstractions, ones that particularly concern perspective and orientation, or how we frame and punctuate messages. Analogies, metaphors, similes, parables, and mental models involve the comparing of relationships, as when we make an imperfect analogy between is-bigger-than and is-faster-than, by inferring that bigger-is-faster.
Given that we seem capable of endless levels of abstraction and haven't yet run out of coding space, we may require a representation scheme that is dimensionally similar for both the elementary (apple) and the high-order (Impressionist Still Life). Happily, in gedankenexperiments, hexagons for cerebral codes seem capable (to anticipate chapter 10) of handling any level of abstraction, meta-metaphors and beyond, even representing the gedankenexperiment itself.
Let us begin with a new category arising from a darwinian competition, order emerging from disorder: what my friend Doug vanderHoof promptly labeled the Brownian notion, on the analogy to the random movement of dust particles in a beam of sunlight and the way they can eventually coalesce into dust bunnies that lurk in undisturbed corners.
Associative memory is a big subject and I don't propose to explain why Pavlov's dog learned to associate the dinner bell with the subsequent appearance of food (there are, one suspects, lots of subcortical ways of doing that). Fancy explanations (such as hexagonal mosaic competitions) aren't needed for associations per se: very simple invertebrates do associations, even within a single second-order neuron. Nor are fancy explanations needed for categories, as such.
Cerebral cortex, however, has the reputation for doing particularly intricate associative memory tasks, especially in the neocortex that mammals have developed so highly. The superficial layers of neocortex certainly look, from Act I, as if they could run a full-fledged darwinian process at the population level, mimicking the more familiar darwinian processes that operate on longer time scales. This goes far beyond, and seems more robust than, anything that I can imagine emerging from a revisionist two-maps loop. That my neocortical Darwin Machine doesn't maintain a germ line, but instead looks Lamarckian enough to overlay even inborn wiring patterns, is a very desirable feature for the seat of cultural evolution. Presumably some cortical areas are less plastic than others, retaining most of their innate attractors over the years.
The cloning competition probably isn't needed as an immediate prelude for making a decision, not for most perceptual tasks or movement programs. The tasks illustrated so far, movement choice and ambiguous object recognition, are well within the capabilities of all of our primate cousins, even tree shrews. The question is whether a Darwin Machine might facilitate higher intellectual function: our syntactic language, our abilities to plan grocery lists and careers, our fondness for making music, for inventing new games to play, and for automatically detecting new patterns among established relationships.
Patterns within patterns is what syntax is all about, what the two-year-old child is detecting via listening before bursting into full sentences with appropriate syntax (and with relatively little overt trial and error). Relations among relations are what metaphor is all about. They're far more abstract than anything that our closest cousins seem to do, though skillful teachers of apes may yet demonstrate that the ape brain is capable of handling the task. Apes could, for example, merely lack the child's acquisitiveness for words and their interrelations. Absence of evidence, as the archaeologists are fond of noting, is not evidence of absence.
Superpositions are capable of doing associations within my hexagons framework, and we have already seen how one spatiotemporal pattern may overlay another at a frontier between two competing patterns. There is a row of hexagons getting both patterns in various ratios; I usually slip into talking of them competing for the space, but they might simply overlap (their triangular arrays might interdigitate without interaction, just as in my example of compacting a Hebbian cell-assembly on p. 46).
The main problem is whether such overlapped hexagons could ever have an independent existence, that is, be able to clone their own territory and compete. This borderline row is, after all, flanked by already established patterns, limiting its reproductive possibilities in a manner reminiscent of hybrid sterility. Let me illustrate two ways to enlarge this thin belt of composite hexagons and improve the hybrid's chances for having viable offspring.
Intrinsic horizontal axons are often longer than 0.5 mm, with additional terminals clustering around integer multiples of the local metric. Error correction comes not only from the six immediate neighbors but potentially from another six backing them up. This means that barriers may need to be thicker if error correction is to be escaped.
But it also means that the no-man's-land of composites may be a few units wide, enough for a little territory of its own if it resonated sufficiently, perhaps one that could survive better than either of the originals. If barriers were appropriately located, one can imagine the hybrid colonizing new territory by spreading out the narrow end of the composite territory.
Frontiers sweeping back-and-forth are the other setup for wide belts of composite hexagons, provided they have lingering effects. Think of Alsace, where both French and German are spoken, thanks to the fluctuations of the French-German frontier (four times in the last century). Or, as mentioned earlier, those parts of Belgium speaking both French and Dutch. Louvain (or, if you prefer, Leuven) is said to be on the language border, but both languages are mutually understood over a much wider belt. The back-and-forth of hexagonal frontiers could leave a multilingual belt that is much wider than the no-man's-land of today's competition.
Note that the first type of belt is in the manner of jam sessions in jazz, operating directly at the level of active cloning of spatiotemporal patterns, overlaying one melody on another. You might acquire a new movement program that allowed you to pat your head and rub your stomach at the same time. But my second wider-belt scheme relies on an intermediary, that of forming a new attractor in the connectivity itself. This could happen without the two active patterns ever being present at the same time (indeed, even if the programs were ordinarily competitors). The novelty starts at the level of attractors, rather than that of performance, almost as if one sheet of music had been photocopied on top of another, and someone then tried to play the composite.
Basins of attraction, because of their afore-mentioned tendency to capture different initial conditions, have that important feature of a loose-fitting category: the resulting activity pattern is about the same, so the significance is about the same. As Walter Freeman notes:
This brings us back to my sashimi example, where a train of thought serves to stack up fading attractors. This allows a certain type of stage setting move. To access rarely used memories (say, the name of the street just north of your childhood home), it may help to first recall the houses on the block, the playmates, the local school, and, best of all, the direction on your left as you looked at sunrise. Then, if this stage setting sufficiently molds the approach landscape, you will finally pop into the right basin of attraction and activate the street name's spatiotemporal pattern, its cerebral code. With some luck, you'll be able to get a big enough chorus singing it, and eventually pronounce the word (in my case, 80th Street).
Lingering basins of attraction need not be limited to those produced by the triangular arrays within a hexagonal area. A synaptic modification can, for example, provide a customary groove that predicts the path of a familiar moving object. One can exploit the time asymmetry of NMDA potentiation; like lingering mouse trails on a computer screen, the enhancement trails the action and doesn't lead it. But on replay, interesting things
Such synaptic enhancement can, in models of repeated trials, convert some place cells into future place cells (their best response becomes centered slightly downstream of where the moving object is currently located). In a feedback or efference copy system, this offset could tend to move an arm back on to a customary path if it was somewhat off the path. This scheme doesn't demand basins of attraction or hexagonal copying, yet the synaptic modification might well generate them as side effects. It's an empirical question that awaits answers, but much of the nonhexagonal activity probably modifies actual basins of attraction and thus changes the chances, making some spatiotemporal patterns easier to clone than others, easier to get started de novo. And reigniting a code is the cortical equivalent of spontaneous combustion.
The neocortical projections of the entorhinal area and amygdala probably function without creating hexagonal mosaics. (Edelman bravely calls these areas cortical appendages, thus risking the charge of Neocorticocentricism! But I agree with him for my present purpose of understanding the basis of quality novelty.) Certainly the four widely broadcast neuromodulators (from subcortical neurons having 10 to 100 times more axon terminals per neuron than most) seem to lack the specificity needed to generate hexagonal mosaics, yet they might be very effective at biasing the basins of attraction.
Giving examples of a category can be challenging. Short of selecting a segregated layer, in the modern manner of CAD program superpositions that allow you to extract the plumbing overlay, how do you decompose one of the hexagonal superpositions?
First of all, while the static diagrams needed for tree-based publication resemble sparsely-filled matrices, ours are not static superpositions in the manner of overprinted characters on a dot-matrix printer. They are spatiotemporal patterns, melodies rather than one crashing chord.
Second, we are not dealing with segregating active patterns here; we want to evoke patterns from embedded attractors. The issue is how the system can dwell around one attractor out of the many that a connectivity implements, a task not unlike how you start running rather than walking, which is, after all, another wing of the multilobed attractor called locomotion. The exemplars of a category may, in effect, be attractor lobes that we need to enter.
Third, we have the distributed database possibility, where pointers are middlemen that link scattered elements. This suggests a series of different representations of the same thing, some more useful for recognition than for recall. Recall is always more difficult than recognition, so let me postpone the question about evoking exemplars until we look at the types of possible representations of a category.
We are familiar with the search that goes from titles to abstracts to full texts, but the nervous system may use other principles in-house. Recollect from chapter 1 how a hashed message digest can be used to find the full text in a database. A hash is like a fingerprint, a unique short-form identifier. Hashing indexes a sparsely-filled high-dimensional space of detailed attributes with stand-ins from a more heavily populated low-dimensional space, the elements of which are highly abstract or even random, compared to what they point to.
One simple application is to create a file name that isn't already in use, and also isn't unnecessarily long, since you want a low-dimensional search space that can be scanned rapidly. A hash of a document can simply use the least significant bits of its checksum, or alternatively, the seconds and minutes fields of file creation time stamps. Just check to make sure it isn't already in use; if so, switch to a different hashing technique and try again. A message digest using more elaborate hashing techniques is exquisitely sensitive to small alterations in the full text, while still remaining fairly short.
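The file-naming recipe just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `short_name_candidates` and `pick_unused_name`, the 16-bit name length, and the particular checksums (CRC-32, then Adler-32, then a salted SHA-256) are all choices made for the sketch, not anything specified in the text.

```python
import hashlib
import zlib


def short_name_candidates(text: bytes, bits: int = 16):
    """Yield short hex names, switching hashing technique on each step.

    Hypothetical helper: `bits` should be a multiple of 4 so that the
    name has a whole number of hex digits.
    """
    mask = (1 << bits) - 1
    width = bits // 4
    # Technique 1: least significant bits of a CRC-32 checksum.
    yield format(zlib.crc32(text) & mask, f"0{width}x")
    # Technique 2: least significant bits of an Adler-32 checksum.
    yield format(zlib.adler32(text) & mask, f"0{width}x")
    # Further techniques: SHA-256 with a changing salt prefix.
    counter = 0
    while True:
        digest = hashlib.sha256(counter.to_bytes(4, "big") + text).digest()
        yield format(int.from_bytes(digest[-4:], "big") & mask, f"0{width}x")
        counter += 1


def pick_unused_name(text: bytes, in_use: set, bits: int = 16) -> str:
    """Return the first candidate name not already in use, and reserve it.

    Note: this loops forever if every possible short name is taken.
    """
    for name in short_name_candidates(text, bits):
        if name not in in_use:
            in_use.add(name)
            return name
```

Calling `pick_unused_name` twice on the same document with the same `in_use` set returns two different names, because the second call finds the first name taken and falls through to the next technique.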
Recognition in cognition could simply involve hashing the sensory input with the same algorithm used for memorizing and then seeing whether this hash matches any of the stored ones. This hash algorithm is, of course, not a truncated checksum but simply, like the abstracting algorithm, the sensory processing abstraction procedures developed by that individual earlier in life, which means that they're unique to the individual, that everyone does it differently.
What's likely to be the most useful short form for recalling categories and specifics? A hash is not an abstract, nor is it a short version of the long text that lacks details. An abstract would be the more useful short form on which to build categories. An abstract or prototype category is just the opposite of a message digest, insensitive to details (a basin of attraction allows for the kind of loose fit that an abstract needs).
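The contrast between a digest and an abstract can be made concrete with a small Python sketch. It rests on assumptions that are not in the text: SHA-256 stands in for the message digest, and the abstract is modeled, very crudely, as coarse rounding of a few made-up numeric features (`roundness`, `redness`, `size`). The rounding is only a stand-in for the loose fit of a basin of attraction.

```python
import hashlib


def digest(text: str) -> str:
    """Message-digest style short form: any small change alters it."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def abstract(features: dict, step: float = 0.5) -> tuple:
    """Prototype style short form: round each feature to a coarse grid,
    so that nearby inputs share one key (hypothetical stand-in for a
    basin of attraction)."""
    return tuple(
        sorted((name, round(value / step) * step)
               for name, value in features.items())
    )


# Two slightly different apples and one quite different object
# (feature names and values are invented for illustration).
apple_a = {"roundness": 0.9, "redness": 0.8, "size": 1.1}
apple_b = {"roundness": 1.0, "redness": 0.9, "size": 0.9}
banana = {"roundness": 0.2, "redness": 0.1, "size": 1.1}

same_category = abstract(apple_a) == abstract(apple_b)      # loose fit
different_category = abstract(apple_a) != abstract(banana)
digests_differ = digest("apple, 0.9") != digest("apple, 1.0")
```

The digest distinguishes inputs that differ by a single character, which suits exact recognition, while the coarse key lumps the two apples together, which is what a category needs.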
Sufficient detail for recall per se is, however, another matter. In chapter 5 (p. 95), I discussed the problem of creating a new basin of attraction amidst all the old ones. Remember that every hexagon of the cortical work space has a somewhat different history, because of where frontiers and barriers were located on various past occasions; the apple resonance might not overlap everywhere with that of banana. Each hexagon has a different sashimi layering, from its particular ghostly blackboard of short-term attractors and from its particular long-term
If copying is not actively maintaining a spatiotemporal pattern with error correction, attractors may alter it to fit a locally embedded one. A pattern close to one of a hexagon's existing attractors will simply be captured, changed to match the one encouraged by an old attractor, and the original lost. It is only in those hexagons where the new pattern doesn't come close to an existing attractor that it stands a chance of being uniquely embedded in the connectivity.
Larger territories make it more likely that a novel spatiotemporal pattern can escape the straitjacket of old attractors, and thus be successfully memorized, for both the short term and the consolidated long term. Note that this accomplishes the same end as hashing's search for an existing match (though it does not guarantee a novel short form version, as a typical hashing procedure does, only facilitates it by using a large cloned territory to try out many different hexagons).
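A toy Python sketch may help fix the capture-or-embed idea. It is a cartoon, not a neural model: patterns are short tuples of 0s and 1s, closeness is Hamming distance, and the names `present_pattern`, `try_territory`, and `capture_radius` are invented for the illustration.

```python
def hamming(a, b):
    """Number of positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def present_pattern(pattern, attractors, capture_radius=2):
    """Present a pattern to one hexagon's list of embedded attractors.

    Returns (resulting_pattern, was_embedded). A pattern within
    capture_radius of an existing attractor is captured and replaced by
    it; otherwise it is embedded as a new attractor.
    """
    pattern = tuple(pattern)
    if attractors:
        nearest = min(attractors, key=lambda att: hamming(att, pattern))
        if hamming(nearest, pattern) <= capture_radius:
            return nearest, False  # captured; the original is lost
    attractors.append(pattern)
    return pattern, True


def try_territory(pattern, hexagons, capture_radius=2):
    """Count the hexagons of a territory in which the pattern embeds."""
    return sum(
        present_pattern(pattern, attractors, capture_radius)[1]
        for attractors in hexagons
    )


# Three hexagons with different histories (contents are made up).
territory = [[(1, 1, 0, 0, 1, 0)], [(0, 0, 1, 1, 0, 1)], []]
novel = (1, 1, 0, 0, 1, 1)
embedded_count = try_territory(novel, territory)
```

In this example the first hexagon captures the novel pattern, since it differs from an old attractor in only one position, while the other two embed it. Trying more hexagons raises the odds that at least one of them has no nearby attractor.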
And why a short form, rather than the real thing? Judging from the difficulties of handling a category-of-one, such as a proper name, the details inherent in a long form are likely to involve a merry chase around a number of hexagons or global attractor lobes in the process of developing details. That suggests a lot of stage setting moves to arrive at the correct ones, perhaps successively plating the territory with a number of different active patterns to shape the sashimi layers into such a form that the correct basin of attraction is entered. (Even a book with a good index seems to require a lot of page flipping before locating a quotable long form.)
One doesn't evoke characteristic activity patterns from silence, of course. It's much more likely to start with random firing patterns that converge onto meaningful ones. Such is what giving exemplars of a category could involve: setting the stage for a detailed category with a series of short forms that warm up the orchestra.
The first few notes of the spatiotemporal melody, as in my discussion of training loops as a spatiotemporal mimic of learning in building up episodic memories (p. 85), might suffice as a unique hash identifier if they are repeatable from one occasion to the next and, as a group, exhibit a lot of variation.
Whatever the short-form prompt for the long-form completion, this serves to illustrate how categories (including sequences such as novel movements and episodic memories) might decompose into detailed parts. The category representation simply needs to link the short forms. Because of the arbitrariness of this composite pattern, the superposition can be repeated yet again for a category of categories (say, food that includes fruit which includes apple and banana). And again, with food a part of inanimate objects.
Note that there is nothing here that requires a consistent hierarchy: apple and fruit could both be full members of food even though apple is itself a member of fruit. The real world is full of category mistakes and full of shortcuts. I imagine that the brain often uses the equivalent of hypertext links: rather than backing up in a tree hierarchy to go down another major branch, we seem to jump between branches like a squirrel. It's disorderly but quick, and good enough usually suffices.
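The difference between a strict tree and this looser, hypertext-like arrangement can be shown with a small Python sketch. The `LINKS` table and the `hops` function are hypothetical; the category names are the ones used above.

```python
from collections import deque

# Category -> direct members. Nothing forces this to be a tree:
# apple is a direct member of food and also a member of fruit.
LINKS = {
    "inanimate objects": {"food"},
    "food": {"fruit", "apple"},
    "fruit": {"apple", "banana"},
}


def hops(category, item, links=LINKS):
    """Fewest links from a category down to an item (breadth-first
    search), or None if the item cannot be reached."""
    queue = deque([(category, 0)])
    seen = {category}
    while queue:
        node, depth = queue.popleft()
        if node == item:
            return depth
        for child in links.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return None
```

Here `hops("food", "apple")` is 1 because of the shortcut link, whereas `hops("food", "banana")` is 2, since banana is reached only by way of fruit.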
In my musical analogy, one node of a triangular array is one note on the piano keyboard. The hexagon contains the whole keyboard (though in no particular order). The spatiotemporal pattern within the hexagon is a melody.
It might be a one-finger melody like the seventh-century Gregorian chants, slowly progressing to nearby notes. Or perhaps several notes in this hexagon fire together, like the tenth-century plainsong of the medieval church, where some voices sang a fifth (a 3:2 ratio of frequencies, seven semi-tones apart) or an octave (2:1, twelve semi-tones apart) higher than the others, though still moving in lockstep.
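The pairing of frequency ratios with semitone counts can be checked with a few lines of Python. The sketch assumes modern twelve-tone equal temperament, in which each semitone multiplies frequency by the twelfth root of 2; that tuning is an assumption added here and is not the tuning of medieval practice. The function names are invented for the illustration.

```python
import math


def equal_tempered_ratio(semitones: int) -> float:
    """Frequency ratio of an interval in twelve-tone equal temperament."""
    return 2.0 ** (semitones / 12)


def cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


fifth = equal_tempered_ratio(7)    # close to, but not exactly, 3:2
octave = equal_tempered_ratio(12)  # exactly 2:1
mistuning = cents(3 / 2) - 700     # pure fifth minus tempered fifth
```

Under this assumption the octave comes out at exactly 2:1, while seven semitones fall just short of the pure 3:2 fifth, by about two cents.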
Now consider our problem of the spatiotemporal pattern for a category. It could be similar to overlaying different melodic lines, where two voices do not move in parallel (in European music, this finally occurred in the thirteenth century). Counterpoint and more complicated aspects of harmony raise issues of what goes together besides those octaves and fifths. For example, in the major and minor scales (the basis of western music starting with the baroque period), only certain semi-tones (7 of the 12 in an octave) are thought to go well enough together to make chords.
Going well together might, of course, not be a matter of actual overlap in performance. Thanks to Hebb's dual trace memory, there's another possibility. It could be a matter of recalling the spatiotemporal pattern from the spatial-only connectivity: yes, you can temporarily overlay anything, but only certain patterns are likely to stick long enough to be recalled a minute later.
Though I intend music only as a teaching analogy, it reminds us that what goes well together must have a basis in the brain, either innate or acquired. It may be that music will aid us in sorting through the many possible local neural circuits within the hexagons of memory, simply because music does reflect something about mind.
Much of the work of cortex probably isn't even triangular (arrays form up because of the superficial pyramidal neurons, which are only about 39 percent of all neocortical neurons), and therefore not represented in this musical framework of notes, chords, counterpoint, and
I have not attempted to account for much cortical detail in this Darwin Machines theory; in particular, I have not tried to account for the perceptual transformations or learning that most neocortical theorists address.
Mine is not a more abstract theory, so much as it is a mechanistic-level theory about abstractions themselves. It can even handle long-distance multimodality categories such as comb.