THE CEREBRAL CODE by William H. Calvin (Chapter 7)


A book by
William H. Calvin
UNIVERSITY OF WASHINGTON
SEATTLE, WASHINGTON 98195-1800 USA

THE CEREBRAL CODE
Thinking a Thought in the Mosaics of the Mind
Available from MIT Press and amazon.com.
copyright ©1996 by William H. Calvin

7
The Brownian Notion

No image could ever be adequate to the concept of a triangle in general. It would never attain that universality of the concept which renders it valid of all triangles, whether right-angled, obtuse-angled, or acute-angled; it would always be limited to a part only of this sphere. The schema of the triangle can exist nowhere but in thought.

Immanuel Kant, 1781

Starting the second act with Kant’s triangles is my way of reminding myself of the importance of how you pose the questions — and how “answers” often reformulate a question rather than answering it. We reposition the foundations beneath our feet as we grope for a firmer footing. This is particularly evident when we deal with abstractions, when we move beyond the representations of our sensory worlds and of our movements and operate in the realm of meta-representations, such as categories or analogies.

But there are some problems with this. When I was first exposed to the problem of generalizing about specific examples of triangles, I was fresh from a course in set theory, and so “spring loaded” to find subsets and supersets as I looked for a mechanistic foundation for such mental categories (as you’ll see, I might have been better off taking a music appreciation course instead). Things look very different to me today, in large part because of developments in cognitive science involving categorical perception, grammar, schemas, scripts, and metaphors. Darwinian copying competitions, in turn, have provided me with another place to stand, a different footing from which to view all of those various types of categories — and an ability to imagine how we might construct them “on the fly,” even invent new levels of abstraction.

The nature of categories has been discussed at least since the ancient Greeks, but the darwinian process provides a fresh way of looking at them.

Yes, the category is a class but there’s often a prototype — a primitive example that shares a lot of features with other members. Eleanor Rosch talks of a basic level category, such as “dog,” that’s defined by the ease with which children and newcomers acquire the concept. Above this is a superordinate level with more abstract classes, such as “mammal” and “pet.” Below the basic level is a subordinate level, subclasses such as “German Shepherd.”

A thing “is” whatever it gives us least trouble to think it is. There is no other “is” than this.

Samuel Butler (II)

Unique individuals, to which we give proper names such as “Fido,” may be difficult class-of-one categories, requiring much more information, such as the features that distinguish that individual from all others of the class. One should not assume — as I did, fresh from set theory — that individuals or episodes are the primitive unit memories, out of which classes are built. With some exceptions such as an infant’s representation of his mother, the unique individual is probably a late stage of category construction, subject to more errors than the populous categories. A unit memory is often more like a forest than an individual tree — and that means there is no firm line between representations and meta-representations.

Most unit memories are probably the fuzzier categories and their associations, not our societal and set theory units. As Bickerton observes, “Without categories, there can be nothing to attach symbols to, since linguistic symbols, as has been apparent at least since Saussure, do not relate directly to objects in the world, but rather to our concepts of the generalized classes to which raw objects belong. Without associations between stimuli (rather than merely between stimulus and response), there would be no way in which symbols could be attached reliably to concepts.” To use words in a referential manner, you’ve got to recognize them as more than mere labels for objects. They have to be treated as abstract units in a hierarchical network of meanings. And meaning, to the followers of Jean Piaget, is inseparable from experience: meanings are constructed.

Associations between unit memories are, moreover, a test of representation schemes, such as my spatiotemporal firing pattern within a neocortical hexagon, as associations, too, must produce representations under some conditions. An association between various representations, as in the various connotations of a word such as “comb,” might be expressed as physical contiguity or overlap:

    a cluster of representations, in the manner of photographs grouped together on a bulletin board,
    a concatenation of representations, in the manner of genes strung together on a chromosome, or
    a composite made by superposition of representations, in the manner of a double exposure, pastiche, or chimera.

But physical contiguity might not be needed. To what extent might a virtual construction suffice? Mere linkages, as in a distributed database, parts of which are maintained in diverse locations and assembled only when needed? What would bind them together, in the manner of those colored strings on school bulletin boards that lead to the portraits of the members of the committee?

I suspect that the categories we often use, like those that birds and monkeys can learn, are simpler than the ones that fill the remaining chapters. Feature detection is often fuzzy, with a wide range of acceptable shapes still yielding the same perception — which is, of course, serving the function of categorization. But some of these crude mechanisms may not be very extensible, as they say in the software business. Simple foundations may suffice for limited purposes but other foundations may prove better when you go to add on a second story (in archaeology, a ruin’s wall thickness is considered a clue to its original height).

Relationships, too, are abstractions, ones that particularly concern perspective and orientation, or how we frame and punctuate messages. Analogies, metaphors, similes, parables, and mental models involve the comparing of relationships, as when we make an imperfect analogy between is-bigger-than and is-faster-than, by inferring that bigger-is-faster.

Given that we seem capable of endless levels of abstraction and haven’t yet run out of coding space, we may require a representation scheme that is dimensionally similar for both the elementary (apple) and the high-order (Impressionist Still Life). Happily, in gedankenexperiments, hexagons for cerebral codes seem capable (to anticipate chapter 10) of handling any level of abstraction, meta-metaphors and beyond — even representing the gedankenexperiment itself.

Let us begin with a new category arising from a darwinian competition, order emerging from disorder — what my friend Doug vanderHoof promptly labeled “the Brownian notion” on the analogy to the random movement of dust particles in a beam of sunlight — and the way they can eventually coalesce into “dust bunnies” that lurk in undisturbed corners.

Associative memory is a big subject and I don’t propose to explain why Pavlov’s dog learned to associate the dinner bell with the subsequent appearance of food (there are, one suspects, lots of subcortical ways of doing that). Fancy explanations (such as hexagonal mosaic competitions) aren’t needed for associations per se: very simple invertebrates do associations, even within a single second-order neuron. Nor are fancy explanations needed for categories, as such.

Cerebral cortex, however, has the reputation for doing particularly intricate associative memory tasks, especially in the neocortex that mammals have developed so highly. The superficial layers of neocortex certainly look, from Act I, as if they could run a full-fledged darwinian process at the population level, mimicking the more familiar darwinian processes that operate on longer time scales. This goes far beyond, and seems more robust than, anything that I can imagine emerging from a revisionist two-maps loop. That my neocortical Darwin Machine doesn’t maintain a germ line, but instead looks Lamarckian enough to overlay even inborn wiring patterns, is a very desirable feature for the seat of cultural evolution. Presumably some cortical areas are less plastic than others, retaining most of their innate attractors over the years.

The cloning competition probably isn’t needed as an immediate prelude for making a decision, not for most perceptual tasks or movement programs. The tasks illustrated so far, movement choice and ambiguous object recognition, are well within the capabilities of all of our primate cousins, even tree shrews. The question is whether a Darwin Machine might facilitate higher intellectual function: our syntactic language, our abilities to plan grocery lists and careers, our fondness for making music, for inventing new games to play, and for automatically detecting new patterns among established relationships.

Patterns within patterns are what syntax is all about, what the two-year-old child is detecting via listening before bursting into full sentences with appropriate syntax (and with relatively little overt trial and error). Relations among relations are what metaphor is all about. They’re far more abstract than anything that our closest cousins seem to do, though skillful teachers of apes may yet demonstrate that the ape brain is capable of handling the task. Apes could, for example, merely lack the child’s acquisitiveness for words and their interrelations. Absence of evidence, as the archaeologists are fond of noting, is not evidence of absence.

Superpositions are capable of doing associations within my hexagons framework, and we have already seen how one spatiotemporal pattern may overlay another at a frontier between two competing patterns. There is a row of hexagons getting both patterns in various ratios; I usually slip into talking of them “competing” for the space, but they might simply overlap (their triangular arrays might interdigitate without interaction, just as in my example of compacting a Hebbian cell-assembly on p. 46).

The main problem is whether such overlapped hexagons could ever have an independent existence, that is, be able to clone their own territory and compete. This borderline row is, after all, flanked by already established patterns, limiting its reproductive possibilities in a manner reminiscent of hybrid sterility. Let me illustrate two ways to enlarge this thin belt of composite hexagons and improve the hybrid’s chances for having viable offspring.

Intrinsic horizontal axons are often longer than 0.5 mm, with additional terminals clustering around integer multiples of the local metric. Error correction comes not only from the six immediate neighbors but potentially from another six backing them up. This means that barriers may need to be thicker if error correction is to be escaped.

But it also means that the no-man’s-land of composites may be a few units wide, enough for a little territory of its own if it resonated sufficiently, perhaps one that could survive better than either of the originals. If barriers were appropriately located, one can imagine the hybrid colonizing new territory by spreading out the narrow end of the composite territory.

Frontiers sweeping back-and-forth are the other setup for wide belts of composite hexagons, provided they have lingering effects. Think of Alsace, where both French and German are spoken, thanks to the fluctuations of the French-German frontier (four times in the last century). Or, as mentioned earlier, those parts of Belgium speaking both French and Dutch. Louvain (or, if you prefer, Leuven) is said to “be on the language border” but both languages are mutually understood over a much wider belt. The back-and-forth of hexagonal frontiers could leave a “multilingual” belt that is much wider than the no-man’s-land of today’s competition.

Note that the first type of belt is in the manner of jam sessions in jazz, operating directly at the level of active cloning of spatiotemporal patterns, overlaying one melody on another. You might acquire a new movement program that allowed you to pat your head and rub your stomach at the same time. But my second wider-belt scheme relies on an intermediary, that of forming a new attractor in the connectivity itself. This could happen without the two active patterns ever being present at the same time (indeed, even if the programs were ordinarily competitors). The novelty starts at the level of attractors, rather than that of performance, almost as if one sheet of music had been photocopied on top of another, and someone then tried to play the composite.

Species that are competitors over ecological time may be mutualists over evolutionary time, each providing a store of genetic variation that can be tapped by the other.

Robert Holt, 1990

Most such superpositions are, of course, unplayable — but not all, and darwinian processes are quite capable of discarding the chaff and retaining the workable for further rounds of variation, shaping up higher quality entities. As work spaces turn over, superpositions ought to happen all the time, as ghostly attractors linger after the synchronized triangular arrays have been silenced. This could function much like the exchange of genetic material between species via retroviruses. Just think of how Charles Ives used those musical snippets, conditioning what followed with the fading attractors of Yankee Doodle.

Basins of attraction, because of their aforementioned tendency to capture different initial conditions, have that important feature of a loose-fitting category: the resulting activity pattern is about the same, so the significance is about the same. As Walter Freeman notes:

[N]eocortex may have one or more global attractors with multiple wings. State transitions may occur as brief confinements to a wing of an attractor, followed by release to another....The concept of an attractor and its attendant basin is too rigid, because neocortical dynamics progresses through time by continual changes in state that adapt the cortices to the changing environment. The change constitutes a trajectory in cortical state space, which never returns exactly to a prior state, but which (on receipt of a stimulus, for example) returns sufficiently close to the prior state that cortical output places a target of the transmission into the same basin of attraction as did the prior output.

Such “chaotic itinerancy” might be like the seasonal progress of a peddler, revisiting towns that have changed somewhat since the last visit. Itinerancy emphasizes the recurrence of similar, rather than identical, states.

This brings us back to my sashimi example, where a train of thought serves to stack up fading attractors. This allows a certain type of stage setting move. To access rarely used memories (say, the name of the street just north of your childhood home), it may help to first recall the houses on the block, the playmates, the local school — best of all, the direction on your left as you looked at sunrise. Then, if this stage setting sufficiently molds the approach landscape, you will finally pop into the right basin of attraction and activate the street name’s spatiotemporal pattern, its cerebral code. With some luck, you’ll be able to get a big enough chorus singing it, and eventually pronounce the word (in my case, “80th Street”).

Lingering basins of attraction need not be limited to those produced by the triangular arrays within a hexagonal area. A synaptic modification can, for example, provide a customary “groove” that predicts the path of a familiar moving object. One can exploit the time asymmetry of NMDA potentiation; like lingering mouse trails on a computer screen, the enhancement trails the action and doesn’t lead it. But on replay, interesting things happen.

Such synaptic enhancement can, in models of repeated trials, convert some “place cells” into “future place cells” (their best response becomes centered slightly downstream of where the moving object is currently located). In a feedback or efference copy system, this offset could tend to move an arm back onto a customary path if it was somewhat off the path. This scheme doesn’t demand basins of attraction or hexagonal copying, yet the synaptic modification might well generate them as side effects. It’s an empirical question that awaits answers, but much of the nonhexagonal activity probably modifies actual basins of attraction and thus changes the chances, making some spatiotemporal patterns easier to clone than others, easier to get started de novo. And reigniting a code is the cortical equivalent of spontaneous combustion.

The neocortical projections of the entorhinal area and amygdala probably function without creating hexagonal mosaics. (Edelman bravely calls these areas “cortical appendages,” thus risking the charge of “Neocorticocentricism!” Still, I agree with him for my present purpose of understanding the basis of quality novelty.) Certainly the four widely broadcast neuromodulators (from subcortical neurons having 10 to 100 times more axon terminals per neuron than most) seem to lack the specificity needed to generate hexagonal mosaics, yet they might be very effective at biasing the basins of attraction.

It is only recently that scientists were stunned to discover how much is actually going on inside the brain during sleep. Once scientists had gotten used to their counterintuitive discovery that internal brain functions persist at high levels during sleep, they gave up the idea that the brain itself ever really rests. Then some cells were discovered in the pons whose activity decreased to about half during non-REM sleep and was virtually arrested during REM sleep while the rest of the brain was active at near seizure levels. What did the cells contain? Norepinephrine and serotonin — the amines. . . . When we are awake, these cells fire and secrete amines continuously, which among other things restrains the cholinergic system. The biggest clusters of serotonin cells lie right down the middle of the pons, and the norepinephrine cells lie on either side of them. From these sites, they all project great distances all the way up to the cortex and down to the spinal cord. This reach is much more widespread than that of the acetylcholine system.

J. Allan Hobson, 1994

Giving examples of a category can be challenging. Short of selecting a segregated layer, in the modern manner of CAD program superpositions that allow you to extract the plumbing overlay, how do you decompose one of the hexagonal superpositions?

First of all, while the static diagrams needed for tree-based publication resemble sparsely-filled matrices, ours are not static superpositions in the manner of overprinted characters on a dot-matrix printer. They are spatiotemporal patterns, melodies rather than one crashing chord.

Second, we are not dealing with segregating active patterns here; we want to evoke patterns from embedded attractors. The issue is how the system can dwell around one attractor out of the many that a connectivity implements, a task not unlike how you start running rather than walking — which is, after all, another wing of the multilobed attractor called locomotion. The exemplars of a category may, in effect, be attractor lobes that we need to enter.

Third, we have the distributed database possibility, where pointers are middlemen that link scattered elements. This suggests a series of different representations of the same thing, some more useful for recognition than for recall. Recall is always more difficult than recognition, so let me postpone the question about evoking exemplars until we look at the types of possible representations of a category.

We are familiar with the search that goes from titles to abstracts to full texts, but the nervous system may use other principles in-house. Recollect from chapter 1 how a hashed message digest can be used to find the full text in a database. A hash is like a fingerprint, a short-form identifier that is unique for practical purposes. Hashing indexes a sparsely-filled high-dimensional space of detailed attributes with stand-ins from a more heavily populated low-dimensional space, the elements of which are highly abstract or even random, compared to what they point to.

One simple application is to create a file name that isn’t already in use, and also isn’t unnecessarily long, since you want a low-dimensional search space that can be scanned rapidly. A hash of a document can simply use the least significant bits of its checksum, or alternatively, the seconds and minutes fields of file creation time stamps. Just check to make sure it isn’t already in use; if so, switch to a different hashing technique and try again. A message digest using more elaborate hashing techniques is exquisitely sensitive to small alterations in the full text, while still remaining fairly short.
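The file-naming trick above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 16-bit width, and the particular order of fallback techniques are hypothetical choices, not anything specified in the text.

```python
import hashlib
import zlib


def crc_low_bits(data: bytes) -> str:
    """Least significant 16 bits of a CRC-32 checksum, as 4 hex digits."""
    return format(zlib.crc32(data) & 0xFFFF, "04x")


def adler_low_bits(data: bytes) -> str:
    """Least significant 16 bits of an Adler-32 checksum, as 4 hex digits."""
    return format(zlib.adler32(data) & 0xFFFF, "04x")


def sha_prefix(data: bytes) -> str:
    """First 4 hex digits of a SHA-256 message digest."""
    return hashlib.sha256(data).hexdigest()[:4]


# Assumed fallback order: try the cheapest technique first.
TECHNIQUES = (crc_low_bits, adler_low_bits, sha_prefix)


def assign_name(text: str, in_use: set) -> str:
    """Return a short name not already in use, switching technique on a clash."""
    data = text.encode("utf-8")
    for technique in TECHNIQUES:
        name = technique(data)
        if name not in in_use:
            in_use.add(name)
            return name
    raise RuntimeError("every candidate short name is already in use")


names_in_use = set()
first = assign_name("apple", names_in_use)
second = assign_name("apple", names_in_use)  # forced clash: falls through to another technique
```

The second call deliberately collides with the first, so it has to move on to a different technique, which is the "try again" step described above.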

Recognition in cognition could simply involve hashing the sensory input with the same algorithm used for memorizing and then seeing whether this hash matches any of the stored ones. This hash algorithm is, of course, not a truncated checksum. Like the abstracting algorithm, it is simply the set of sensory processing abstraction procedures that the individual developed earlier in life, which means that it is unique to the individual: everyone does it differently.

What’s likely to be the most useful short form for recalling categories and specifics? A hash is not an abstract, nor is it a short version of the long text that lacks details. An abstract would be the more useful short form on which to build categories. An abstract or prototype category is just the opposite of a message digest, insensitive to details (a basin of attraction allows for the kind of loose fit that an abstract needs).
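The contrast between a message digest and an abstract can be made concrete with a small Python sketch. The feature vectors, their three made-up dimensions, and the nearest-prototype rule are assumptions chosen for illustration; the nearest-prototype rule is a crude stand-in for a basin of attraction, not a model of cortex.

```python
import hashlib


def digest(text: str) -> str:
    """Short message digest: changes when any character of the text changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


# Hypothetical prototypes scored on (roundness, elongation, redness), each 0..1.
PROTOTYPES = {
    "apple": (0.9, 0.1, 0.8),
    "banana": (0.2, 0.9, 0.1),
}


def nearest_prototype(features):
    """Loose-fit category: pick the prototype with the smallest squared distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(PROTOTYPES, key=lambda name: squared_distance(features, PROTOTYPES[name]))


# A one-character change gives a different digest...
digest_a = digest("a red apple")
digest_b = digest("a red apple.")

# ...while two rather different apples still land in the same category.
category_a = nearest_prototype((0.8, 0.2, 0.6))
category_b = nearest_prototype((0.85, 0.15, 0.9))
```

The digest is useful for asking "have I stored exactly this before?", while the prototype lookup tolerates variation, which is what a category needs.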

Sufficient detail for recall per se is, however, another matter. In chapter 5 (p. 95), I discussed the problem of creating a new basin of attraction amidst all the old ones. Remember that every hexagon of the cortical work space has a somewhat different history, because of where frontiers and barriers were located on various past occasions; the apple resonance might not overlap everywhere with that of banana. Each hexagon has a different sashimi layering, from its particular ghostly blackboard of short-term attractors and from its particular long-term memories.

If copying is not actively maintaining a spatiotemporal pattern with error correction, attractors may alter it to fit a locally embedded one. A pattern close to one of a hexagon’s existing attractors will simply be captured, changed to match the one encouraged by an old attractor, and the original lost. It is only in those hexagons where the new pattern doesn’t come close to an existing attractor that it stands a chance of being uniquely embedded in the connectivity.
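Capture by an existing attractor can be illustrated with a toy Hopfield-style network in Python. This is a standard textbook model swapped in for illustration, not the hexagonal mechanism itself; the eight-unit patterns and the names `APPLE` and `BANANA` are hypothetical.

```python
def train(patterns):
    """Hebbian weights: units that are active together get a positive link."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-connections
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def settle(weights, state, max_sweeps=10):
    """Update one unit at a time until nothing changes (or sweeps run out)."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(weights[i][j] * state[j] for j in range(n))
            if field != 0:
                new_value = 1 if field > 0 else -1
                if new_value != state[i]:
                    state[i] = new_value
                    changed = True
        if not changed:
            break
    return state


# Two previously embedded patterns (units are +1 or -1).
APPLE = [1, 1, 1, 1, -1, -1, -1, -1]
BANANA = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train([APPLE, BANANA])

# A new pattern that differs from APPLE in just one unit.
almost_apple = [-1, 1, 1, 1, -1, -1, -1, -1]
captured = settle(weights, almost_apple)
```

After settling, the near-miss pattern has been pulled onto the old one and its distinguishing detail is gone, which is the loss described above.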

Larger territories make it more likely that a novel spatiotemporal pattern can escape the straitjacket of old attractors, and thus be successfully memorized, for both the short term and the consolidated long term. Note that this accomplishes the same end as hashing’s search for an existing match (though it does not guarantee a novel short form version, as a typical hashing procedure does, only facilitates it by using a large cloned territory to try out many different hexagons).

And why a short form, rather than the real thing? Judging from the difficulties of handling a category-of-one, such as a proper name, the details inherent in a long form are likely to involve a merry chase around a number of hexagons or global attractor lobes in the process of developing details. That suggests a lot of stage setting moves to arrive at the correct ones, perhaps successively plating the territory with a number of different active patterns to shape the sashimi layers into such a form that the correct basin of attraction is entered. (Even a book with a good index seems to require a lot of page flipping before locating a quotable long form.)

One doesn’t evoke characteristic activity patterns from silence, of course. It’s much more likely to start with random firing patterns that converge onto meaningful ones. Such is what giving exemplars of a category could involve: setting the stage for a detailed category with a series of short forms that warm up the orchestra.

The first few “notes” of the spatiotemporal “melody” — as in my discussion of training loops as a spatiotemporal mimic of learning in building up episodic memories (p. 85) — might suffice as a unique “hash” identifier if they are repeatable from one occasion to the next and, as a group, exhibit a lot of variation.

Whatever the short-form prompt for the long-form completion, this serves to illustrate how categories (including sequences such as novel movements and episodic memories) might decompose into detailed parts. The category representation simply needs to link the short forms. Because of the arbitrariness of this composite pattern, the superposition can be repeated yet again for a category of categories (say, food that includes fruit which includes apple and banana). And again, with food a part of inanimate objects.

Note that there is nothing here that requires a consistent hierarchy: apple and fruit could both be full members of food even though apple is itself a member of fruit. The real world is full of category mistakes and full of shortcuts. I imagine that the brain often uses the equivalent of hypertext links: rather than backing up in a tree hierarchy to go down another major branch, we seem to jump between branches like a squirrel. It’s disorderly, but quick — and good enough usually suffices.
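The difference between a strict tree and hypertext-style shortcuts can be sketched as a small graph in Python. The membership table below is a hypothetical rendering of the food and fruit example; nothing about how a brain would store such links is implied.

```python
from collections import deque

# Each item lists every category it belongs to directly. "apple" links to
# "food" as well as "fruit", so the structure is not a consistent tree.
MEMBER_OF = {
    "apple": {"fruit", "food"},
    "banana": {"fruit", "food"},
    "fruit": {"food"},
    "food": {"inanimate objects"},
}


def hops(item, target):
    """Fewest membership links from item up to target, or None if unreachable."""
    queue = deque([(item, 0)])
    seen = {item}
    while queue:
        current, distance = queue.popleft()
        if current == target:
            return distance
        for parent in MEMBER_OF.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, distance + 1))
    return None


direct = hops("apple", "food")  # uses the shortcut link
via_top = hops("apple", "inanimate objects")
```

With the shortcut in place, apple reaches food in one jump rather than climbing through fruit first.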

In my musical analogy, one node of a triangular array is one note on the piano keyboard. The hexagon contains the whole keyboard (though in no particular order). The spatiotemporal pattern within the hexagon is a melody.

It might be a one-finger melody like the seventh-century Gregorian chants, slowly progressing to nearby notes. Or perhaps several notes in this hexagon fire together, like the tenth-century plainsong of the medieval church, where some voices sang a fifth (a 3:2 ratio of frequencies, seven semi-tones apart) or an octave (2:1, twelve semi-tones apart) higher than the others, though still moving in lockstep.
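The two intervals can be checked with a little arithmetic, sketched here in Python. The calculation assumes modern equal temperament, where each semitone multiplies the frequency by the twelfth root of two; medieval singers would have used the pure ratios, so the fifth here is only approximately 3:2.

```python
def equal_tempered_ratio(semitones: int) -> float:
    """Frequency ratio of an interval spanning the given number of semitones."""
    return 2 ** (semitones / 12)


fifth = equal_tempered_ratio(7)    # close to, but not exactly, 3:2
octave = equal_tempered_ratio(12)  # 2:1

# How far the tempered fifth falls from the pure 3:2 ratio.
fifth_error = abs(fifth - 3 / 2)
```

The octave comes out at 2:1 and the seven-semitone fifth lands within about two parts in a thousand of 3:2.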

Now consider our problem of the spatiotemporal pattern for a category. It could be similar to overlaying different melodic lines, where two voices do not move in parallel (in European music, this finally occurred in the thirteenth century). Counterpoint and more complicated aspects of harmony raise issues of what goes together besides those octaves and fifths. For example, in the major and minor scales (the basis of Western music starting with the baroque period), only certain semi-tones (7 of the 12 in an octave) are thought to go well enough together to make chords.

Going well together might, of course, not be a matter of actual overlap in performance. Thanks to Hebb’s dual trace memory, there’s another possibility. It could be a matter of recalling the spatiotemporal pattern from the spatial-only connectivity: yes, you can temporarily overlay anything, but only certain patterns are likely to stick long enough to be recalled a minute later.

Though I intend music only as a teaching analogy, it reminds us that what goes well together must have a basis in the brain, either innate or acquired. It may be that music will aid us in sorting through the many possible local neural circuits within the hexagons of memory, simply because music does reflect something about mind.

Much of the work of cortex probably isn’t even triangular (arrays form up because of the superficial pyramidal neurons, which are only about 39 percent of all neocortical neurons), and therefore not represented in this musical framework of notes, chords, counterpoint, and choirs. I have not attempted to account for much cortical detail in this Darwin Machines theory; in particular, I have not tried to account for the perceptual transformations or learning that most neocortical theorists address.

Mine is not a more abstract theory, so much as it is a mechanistic-level theory about abstractions themselves. It can even handle long-distance multimodality categories such as comb.