William H. Calvin and Derek Bickerton, Lingua ex Machina, chapter 7 (MIT Press)

September 1999

COPY-AND-PASTE CITATION

William H. Calvin and Derek Bickerton, Lingua ex Machina: Reconciling Darwin and Chomsky with the human brain (MIT Press, 2000), chapter 7. See also http://WilliamCalvin.com/LEM/LEMch7.htm

The nonvirtual book is available from amazon.com or direct from MIT Press.

Webbed Reprint Collection

William H. Calvin

University of Washington
Seattle WA 98195-1800 USA


Hexagonal Mosaics and Darwin Machines

Brains contain almost 100 billion (10¹¹) neurons in the cerebral cortex alone, and some other parts of the brain have many more neurons. Pyramidal neurons, which are tall neurons with triangular-shaped cell bodies, are the most numerous of the cortical neurons. They have a splendid dendritic tree, ascending a millimeter or two toward the cortical surface and breaking into a number of finer branches, seeking out inputs.

In any case, contemplating the form of the cells was one of my most beloved pleasures. Because even from an aesthetic point of view the nervous tissue has fascinating beauty. Are there in our parks any more elegant and lush trees than the Purkinje neuron in the cerebellum or the so-called psychical cell, that is, the famous cortical pyramidal neuron?

Santiago Ramón y Cajal, 1923

For output, each pyramidal neuron has a single axon leaving the cell body; it is thinner than the finest spider thread. After going a ways, the axon starts to branch as well, finally breaking into many thousands of terminal branches. Some of the branches terminate in a synapse only 0.1 mm away; others may terminate 1,000 mm away in the spinal cord. Because the synapse mostly works in only one direction, each axon branch is effectively a one-way street, all in the same direction, away from the tall tree of the dendrites and toward the axon endings. Simple back-and-forth circuits nearly always involve several neurons, just as one-way streets tend to come in pairs.

A synapse is a little gap between two adjacent cells, a border-crossing point where perfumelike neurotransmitter molecules are wafted into the no man's land and signal the specialized sniffers on the dendrite of the downstream cell (that's why synapses are mostly one-way: only one side releases packets of the neurotransmitter molecules) that something is happening upstream. Release of neurotransmitter happens when a brief electrical impulse (also known as a spike or action potential) is sent from the beginning of the axon near the cell body. It propagates down the axon and out all its fine branches, releasing packets of the neurotransmitter chemical (in these pyramidal neurons, the amino acid called "glutamate") from the axon terminals when the impulse finally reaches them.

On the other side of the synapse, the neurotransmitter causes (via that sniffer mechanism) a small voltage change in the downstream neuron known as an excitatory postsynaptic potential (EPSP). A number of such inputs are usually required to trigger an impulse in the downstream neuron; this threshold for impulse production means that the neurons even further downstream aren't told anything about what's happening upstream unless a sufficient number of the right types of inputs have coordinated their actions. A chain of neurons represents a cascade of thresholds to be overcome, so most chains are silent much of the time.

The other major type of cortical neuron is called stellate; it lacks the tall dendrite, having only bushy ones, and therefore looks more like a shrub than a tree. It works just like the others except for releasing a different neurotransmitter chemical, typically GABA. This different scent wafted across the synaptic no man's land has an inhibitory action on its downstream neurons, a negative voltage change (the IPSP) that reduces any positive EPSPs.

The dendrites of a pyramidal neuron receive about 10,000 inputs; 8,000 synapses are excitatory, and 2,000 inhibitory. Their actions add and subtract like deposits and withdrawals from a bank account, though sometimes with nonlinear double-your-money features rarely provided by banks. But most synapses are silent at any given moment, with only a few hundred needed to put the neuron through its paces, ranging from barely firing to repeatedly firing as fast as possible. In general, you can think of the pyramidal neurons as the excitatory ones, and the stellates as the inhibitory ones.

To summarize this simplified cellular neurophysiology, voltage changes are the basis of computation (you add and subtract with PSPs, trying to exceed a voltage threshold). Voltage change is also the basis of the communication of the computational result (impulses propagate over long distances, then release neurotransmitter). The synapses are just, so far, a primitive way of passing voltage changes across the gap between adjacent cells, something like changing Italian lira for Swiss francs when crossing the border west of Lake Como.
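The add-and-subtract bookkeeping can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 50-microvolt size of each PSP and the 15,000-microvolt threshold are hypothetical round numbers chosen so that "a few hundred" active inputs reach threshold; they are not figures from this chapter, and the nonlinear features mentioned above are left out.

```python
def membrane_shift_uv(active_excitatory, active_inhibitory,
                      epsp_uv=50, ipsp_uv=50):
    """Net voltage change in microvolts: EPSPs add, IPSPs subtract.

    The per-synapse sizes are assumed values, not measurements.
    """
    return active_excitatory * epsp_uv - active_inhibitory * ipsp_uv


def fires(active_excitatory, active_inhibitory, threshold_uv=15_000):
    """True when the summed PSPs reach the (assumed) impulse threshold."""
    return membrane_shift_uv(active_excitatory, active_inhibitory) >= threshold_uv


print(fires(300, 0))    # 300 deposits of 50 uV just reach the threshold
print(fires(300, 100))  # 100 withdrawals keep the neuron silent
print(fires(400, 100))  # extra excitation makes up the difference
```

Integer microvolts are used so that the comparison with the threshold is exact rather than subject to floating-point rounding.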

But synapses are also the most easily altered link in this chain of intracellular events, the major way of adjusting the strength of the excitatory or inhibitory influence, of making an input twice as influential as it was before. Synapses are the volume controls of the brain, what affects how "loudly" one neuron "hears" another. The arriving impulse can release more packets of neurotransmitter than before. Previously ineffective postsynaptic channels can also be brought on line, augmenting the effect of a standard dose of neurotransmitter via having more sniffers.

This altered strength is usually temporary, fading in seconds or minutes, but it can be made relatively permanent during consolidation. In combination with many other synapses in an ensemble of neurons, it is what records new memories. More indiscriminate changes in synaptic strength, affecting many cells at once, are how most mind-altering chemicals work, such as stimulants and anesthetics, as well as the major psychiatric medications.

Circuits involving a number of neurons can do things that a single neuron cannot. They can create precision timing between impulses, free of jitter in a way that no one lonely unconnected neuron could ever manage to do. They can also create complex patterns between neurons: eighty-eight neurons, each hooked up to a piano key, could play little tunes of some complexity. When I use the term "spatiotemporal pattern," just think of an ensemble of neurons (sometimes more than 88) creating a distinctive little pattern of firing, rather like a line of melody.

Just as any one pixel of a computer screen participates at various times in many different patterns representing different letters and drawings, so any one neuron participates in many committees. Each committee generates a different tune. Though we tend to focus on the neuron as the unit of computation and the synapse as a site of modifiability, the pattern's the thing, much as it is on a computer screen. To understand how concepts, words, phrases, clauses, and sentences are represented, and how they compete with one another to shape up quality, we have to understand the elementary patterns on which other patterns build.

Shortly, I'm going to claim that one little tune is the code A for apple and that other tunes (more like symphonies with a number of different voices) create a temporary code for a sentence. Each of those tunes can play from a "keyboard" about 300 notes long, whose neurons are contained in a cortical space about 0.5 mm across, shaped like a hexagon. The tuneful hexagons always exist redundantly in a little hexagonal mosaic of identical clones, like that synchronized chorus I mentioned when introducing Pringle's 1951 insight. I hope that this will serve as motivation to learn the cortical circuitry that serves as the basis for the hexagons that ought to emerge now and then in the electrical activity.

Synchronization of cortical neurons can occur via a number of mechanisms, but the one of most interest for our present purposes involves a peculiar property of the axon of a pyramidal neuron in the upper layers of neocortex. The axon acts like an express train, skipping many intermediate stops, giving off synapses only when about 0.5, 1.0, and 1.5 mm away from the tall dendrite (and sometimes continuing for a few millimeters farther, maintaining the integer multiples of the basic metric, 0.5 mm). These express axons spread sideways in the cortex, remaining within the superficial layers and synapsing mostly with other pyramidal neurons in the superficial layers of neocortex. The same originating pyramidal neuron has, of course, other axon branches (it has nearly 10,000 total) that burrow into the white matter and travel long distances front and back, left and right, but the express train arrangement is a property of the axon branches that stay near home, not straying very far from their layers of origin. The pyramidal neurons of the deeper cortical layers also have recurrent excitation, but their sideways axon branches don't seem to have the express train gaps where synapses are omitted (they're the "all-stops" trains).

Neuroanatomists have seen variants on the superficial layer express train patterning in most cortical areas, and in most mammals, that they have examined thus far. I have predicted (in The Cerebral Code) what some of the physiological consequences might be, and when our physiological recording and imaging techniques improve in resolution, we can begin to answer such important questions as how often and where the predicted Darwinian processes occur. For the moment, what follows has the status of a theoretical prediction based on neuroanatomy, not physiological data.

Because neurons of a particular cortical area all have about the same gap length, there's a good chance of two neurons talking to one another; that is, reciprocally exciting one another. While this might produce the circuitry needed for an impulse to chase its tail, round and round between neurons, I doubt that this is a common occurrence. The most likely consequence of this mutual connectivity is that the two neurons will often fire at about the same time. Many models of coupled oscillators exhibit such entrainment; it was first reported in 1665 by the Dutch physicist Christiaan Huygens, who noticed that two pendulum clocks sitting on the same shelf would synchronize their ticks via vibrations within a half hour after being started at different times. Fireflies do the same thing much more quickly; whole trees full of fireflies can be observed, flashing in synchrony.
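Entrainment of this kind can be illustrated with one of the standard coupled-oscillator models. The sketch below swaps in the Kuramoto phase model for two identical oscillators; that model, the coupling strength, the shared frequency, and the time step are all assumptions made for illustration, not a description of how real neurons or Huygens's clocks are coupled.

```python
import math


def phase_gap_after(initial_gap, coupling=1.0, dt=0.01, steps=2000):
    """Euler-integrate two identical Kuramoto oscillators.

    Each oscillator is nudged toward the other's phase. Returns the
    final phase difference (radians) after `steps` time steps.
    """
    theta_a, theta_b = 0.0, initial_gap
    omega = 2 * math.pi  # shared natural frequency (assumed)
    for _ in range(steps):
        rate_a = omega + coupling * math.sin(theta_b - theta_a)
        rate_b = omega + coupling * math.sin(theta_a - theta_b)
        theta_a += rate_a * dt
        theta_b += rate_b * dt
    return theta_b - theta_a


print(phase_gap_after(2.0))                # coupled: the gap shrinks toward zero
print(phase_gap_after(2.0, coupling=0.0))  # uncoupled: the gap stays near 2.0
```

With mutual coupling the two phases are pulled together regardless of where they started (short of exactly opposite phases); without it, the initial offset simply persists.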

The express train anatomy in the superficial layers of cortex suggests that neurons 0.5 mm apart in triangular arrays might be firing in synchrony on many occasions, even when in-between neurons are not. As the background balance of excitation and inhibition varies, you'd expect to see a given triangular array extend its reach for many millimeters, then contract into just a few nodes, then disappear entirely. These patterns are ephemeral.
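The geometry of such an array can be sketched as a triangular lattice whose nodes sit one gap length apart. The 0.5 mm spacing comes from the text; the lattice construction, the function names, and the idea of measuring the array in "rings" around a central node are illustrative assumptions.

```python
import math

SPACING_MM = 0.5  # express-axon gap length given in the text


def triangular_array(rings, spacing=SPACING_MM):
    """Nodes of a triangular lattice within `rings` steps of a central node."""
    nodes = []
    for i in range(-rings, rings + 1):
        for j in range(-rings, rings + 1):
            if abs(i + j) <= rings:
                x = spacing * (i + j / 2)
                y = spacing * (j * math.sqrt(3) / 2)
                nodes.append((x, y))
    return nodes


def neighbors_at_spacing(node, nodes, spacing=SPACING_MM):
    """Nodes lying exactly one gap length away from `node`."""
    return [n for n in nodes
            if math.isclose(math.dist(node, n), spacing, rel_tol=1e-9)]


array = triangular_array(rings=2)
print(len(array))                                    # nodes in the array
print(len(neighbors_at_spacing((0.0, 0.0), array)))  # partners of the central node
```

An interior node has six partners one gap length away, which is why the synchronized nodes form equilateral triangles and why the space between them tiles into hexagons.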

Furthermore, there might be another triangular array, 0.2 mm away from the first, firing in synchrony but at a different time than the first triangular array. Indeed, there could be hundreds of triangular arrays, each firing at different times, but I doubt if that happens very often; a dozen seems quite sufficient, just as you'd seldom use more than a dozen keys on a piano for a simple tune, even though 88 are available. Just think of a roomful of player pianos, all playing the same tune in lockstep.

So how many keys does the cortical piano have? Well, the largest collection of active nodes (from all arrays taken together) that has no redundancies (just one member of each active array) could be no larger than a hexagon 0.5 mm across (the corresponding points, say the upper left corners, of a mosaic of hexagonal tiles are always connected by triangular arrays). Within a 0.5 mm hexagon of neocortex are about 30,000 neurons, but they often function together in units of about 100 neurons, each called a minicolumn (the orientation columns of visual cortex are the most familiar example, where all neurons seem to be interested in the same thing, lines and edges tilted at the same angle to the visual vertical). Because there are about 300 units within a hexagon, think of a piano keyboard 300 keys long, each key mapped to a particular minicolumn and sounding whenever a cell in that minicolumn fires. And think of not just one megapiano but a whole chorus of them, expanding to recruit more megapianos nearby.
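The keyboard arithmetic is simple enough to check directly. In this sketch the neuron and minicolumn counts are the approximate figures from the text; treating "0.5 mm across" as the flat-side-to-flat-side width of a regular hexagon is an assumption made here so that an area can be computed.

```python
import math

NEURONS_PER_HEXAGON = 30_000   # approximate figure from the text
NEURONS_PER_MINICOLUMN = 100   # approximate figure from the text
HEXAGON_WIDTH_MM = 0.5         # from the text; assumed flat side to flat side

# One "key" per minicolumn.
keys = NEURONS_PER_HEXAGON // NEURONS_PER_MINICOLUMN

# Area of a regular hexagon with flat-to-flat width w is (sqrt(3) / 2) * w**2.
area_mm2 = (math.sqrt(3) / 2) * HEXAGON_WIDTH_MM ** 2

print(keys)                # 300
print(round(area_mm2, 3))  # cortical surface area of one hexagon, in mm^2
```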

Once two hexagons get their little tune going (and, you realize, that spatiotemporal pattern is what I was referring to, when I talked of the code A that represented apple), they can recruit neighboring hexagons; this probably happens one triangular array at a time but, when they've all recruited new nodes in an adjacent space, it's as if a hexagon had been cloned.

It's like a plainchant choir, recruiting additional singers. But it's all very ephemeral, here one second and gone the next. Yet I consider it the basis of working memory, and an excellent candidate for how a Darwinian process could function in the brain to improve quality. Indeed, I discovered it because I was on the lookout for cortical circuitry that could support Darwin's recursive bootstrapping of quality, on the time scale of thought and action.

Darwin's discovery about how evolution could occur in a simple, almost automatic way revolutionized our notions about how complex plants and animals came into being. Though often summarized by Darwin's phrase, "natural selection," it's really a process with six essential ingredients; when any are missing, interesting things may still happen but the recursive aspect is missing, what allows the course to be repeated for additional credit. The cortical entrainment circuitry is how you get started with a Darwinian process in the cortex, operating on the time scale of thought and action, shaping up perceptions, ideas, and action plans into higher and higher quality.

A good cloning mechanism is not, of course, the whole Darwinian quality improvement process. So far as I can tell, you need

(1) a characteristic pattern (like that A melody) that can

(2) be copied, with

(3) occasional variations (A') or compounding, where

(4) populations of A and A' compete for a limited territory, their relative success biased by

(5) a multifaceted environment (Darwin's natural selection), and where

(6) the next round of variants is primarily based on the more successful of the current generation (Darwin's inheritance principle).
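The six essentials above can be sketched as a toy loop. Everything concrete here is a hypothetical stand-in rather than the cortical mechanism: patterns are short tuples of bits, the territory is a ring of sites, and the "environment" is a single made-up target pattern (`TARGET`) that scores a pattern by how many positions match.

```python
import random

TARGET = (1, 0, 1, 1, 0, 0, 1, 0)  # hypothetical resonance favored by the environment


def score(pattern):
    """(5) Environment: count the positions that match the favored resonance."""
    return sum(p == t for p, t in zip(pattern, TARGET))


def copy_with_variation(pattern, rng, error_rate=0.1):
    """(2) Copy a pattern, (3) occasionally flipping one element."""
    clone = list(pattern)
    if rng.random() < error_rate:
        i = rng.randrange(len(clone))
        clone[i] = 1 - clone[i]
    return tuple(clone)


def run(rounds=200, territory_size=30, seed=0):
    rng = random.Random(seed)
    start = tuple(rng.randint(0, 1) for _ in TARGET)  # (1) the unit pattern
    territory = [start] * territory_size              # (4) a limited territory
    history = [sum(score(p) for p in territory)]
    for _ in range(rounds):
        site = rng.randrange(territory_size)
        neighbor = (site + rng.choice((-1, 1))) % territory_size
        incoming = copy_with_variation(territory[neighbor], rng)
        # (4) + (5): the incoming copy takes the site only if it scores higher.
        if score(incoming) > score(territory[site]):
            territory[site] = incoming
        history.append(sum(score(p) for p in territory))
    return territory, history


territory, history = run()
print(history[0], history[-1])  # total score before and after
```

Essential (6) is implicit in the loop: a pattern that holds more sites is chosen as the copying neighbor more often, so the next round of variants is drawn mostly from the currently successful patterns. Because a site is only ever taken over by a higher-scoring pattern, the total score can never fall in this simplified version.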

There are some other things, such as sex and environmental fluctuations, which will make the Darwinian process operate faster, but they're optional: you can get the recursive bootstrapping of quality without them. A lot of things loosely called "Darwinian" may involve only some of the essentials; say, neural development where a pattern is created by selective removal of connections biased by a multifaceted environment. They're interesting and very useful, but they exhibit no copying, have no populations to compete, and lack a next generation biased by antecedent success. They're not able to repeat the course for additional credit, as you can do with a full-fledged Darwinian process. Such recursion is how you bootstrap quality, why we can start with subconscious thoughts as jumbled as our nighttime dreams and still end up with a sentence of quality or a logical expression. You need a quality bootstrapping mechanism in order to figure out what to do with leftovers in the refrigerator; with successive attempts running through your head as you stand there with the door open, you can often find a "quality" scheme, that is, one that doesn't require another trip to the grocery store.

The focus on what sort of ensemble activity could be cloned (#2) actually serves to define the unit pattern (#1), the spatiotemporal firing pattern in a few hundred minicolumns within a 0.5 mm hexagon. (That is, by the way, how the DNA sequence pattern was discovered to be the genetic message: Crick and Watson were searching for what molecule could be reliably copied during mitosis.) To get variants, cloning needs to be slightly imperfect, and that's not difficult when hexagonal mosaics are still small. Just as thick fingers might strike two piano keys at once or land on a neighboring key, so variants in the spatiotemporal pattern (#3) can easily arise, particularly when inexcitable hexagons limit the number of neighboring hexagons to only two or three. If the variant "breeds true" by cloning, then one can have two different populations that can compete with each other (#4), rather as bluegrass and crabgrass compete for my back yard.

One pattern may do better than the other because of the cortex's multifaceted environment. Just as bluegrass may do better than crabgrass because of your attempts to cut it regularly, water it, fertilize it, and so forth, so cortex has a number of factors that together allow one pattern to clone territory better than its competitors (#5). They include current sensory inputs to the cortex beneath the competing patterns, the background of neuromodulators (the mix of serotonin, dopamine, norepinephrine, acetylcholine, and a flock of peptides), and the "washboarded road" of the synaptic strengths that allow some patterns to resonate well; in other words, memories.

Finally, we need a cortical version of Darwin's inheritance principle (#6), which will preferentially create the next generation of variants from the more numerous of the current patterns. This happens because large hexagonal mosaics have more perimeter than the smaller, less successful ones, and the periphery is the only place that pattern copying can escape perfect cloning, where they have fewer than six identical neighbors to conform them to the standardized pattern. The mosaic's periphery is also where the variant A' has an unpatterned territory next door, available for colonization. There A' can "set up shop" and go into competition with its parent pattern A. So a more successful pattern has a larger territory (which has more edge length, and therefore more opportunities to generate new variants) than do the less successful patterns.

If the background level of neuromodulators or neurotransmitters fades, a large hexagonal mosaic will be broken up into a series of small mosaics, with abandoned territories in between where triangular arrays could not be maintained. This population crash is what happens to animal populations during a drought; it's also like what happens when rising sea level fragments a low-land population into subpopulations on a series of hilltops, now called islands. All of a sudden, there is a lot more perimeter, many more chances for new variants to get started.
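The perimeter argument can be checked by counting. The sketch below assumes an idealized, compact mosaic (a central hexagon plus complete surrounding rings) described in axial hexagon coordinates; real mosaics need not be so tidy, and the function names are illustrative.

```python
# The six neighbors of a hexagon in axial (q, r) coordinates.
NEIGHBOR_STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def compact_mosaic(rings):
    """A central hexagon plus `rings` complete rings of hexagons around it."""
    return {(q, r)
            for q in range(-rings, rings + 1)
            for r in range(-rings, rings + 1)
            if abs(q + r) <= rings}


def edge_hexagons(mosaic):
    """Hexagons with fewer than six neighbors inside the mosaic."""
    return {(q, r) for (q, r) in mosaic
            if any((q + dq, r + dr) not in mosaic for dq, dr in NEIGHBOR_STEPS)}


large = compact_mosaic(3)
small = compact_mosaic(1)
print(len(large), len(edge_hexagons(large)))          # 37 hexagons, 18 on the edge
print(len(small), len(edge_hexagons(small)))          # 7 hexagons, 6 on the edge
print(5 * len(small), 5 * len(edge_hexagons(small)))  # five islands: 35 hexagons, 30 on the edge
```

The large mosaic has three times the edge of the small one, so it has more places for variants to arise. Breaking roughly the same territory into five well-separated islands raises the edge count from 18 to 30, which is the fragmentation effect described above.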

Environmental fluctuations and islands may not be essential for a Darwinian process to recursively bootstrap quality, but they can certainly speed up the process. Judging from the "waves" of the EEG, the cortex has lots of excitability fluctuations that cover a few square centimeters of cortex, able to pump up quality via a series of population crashes and re-expansions. If a Darwinian process is to operate quickly enough to produce good results within the behavioral windows of opportunity, it may need all the known catalysts.

Systematic recombination is the other major catalyst. We tend to think of variants as arising from mutations, but evolution wouldn't have gotten very complex without some more systematic ways of making errors and encouraging new combinations. Bacterial conjugation serves to mix up the genes of several individuals; sex is a more systematic way of doing the same thing.

In cortex, variants may arise at the periphery but there are several ways to perform superpositions of two patterns. The first naturally arises when two different hexagonal mosaics meet: they may overlap, if sufficiently different; the musical analogy would be two-part singing, just as in the medieval elaboration of plainchant into several voices. Some patterns go together better than others; the same melody displaced up a fifth or an octave works well, and the later development of major and minor scales provides other examples.

Going together well in cortex ("harmony") is probably a matter of the multifaceted environment; copying mechanisms could temporarily maintain any two overlapped patterns, but only those combination patterns that resonate well with the "washboarded road" of synaptic strengths and current sensory inputs might be able to continue, once imposed copying faded next door and the spatiotemporal firing pattern had to sustain itself like a choir without prompts from the choirmaster.

While the real strength of a Darwinian process is for divergent thinking (creativity, where there is no right answer), it can also be used for convergent thinking when the answer isn't obvious. Sometimes you have to guess well, as when acoustics or conflicting overheard conversations force you to guess the words that you missed hearing (a substantial problem at the Villa Serbelloni when the dining room is full and dozens of voices are bouncing off the stone walls). So let me use the word-guessing problem as an example of how a Darwinian process can help you make a good (though not necessarily correct) guess.

DB: It's been reckoned, I'm not sure quite how, that languages are about 50 percent redundant, that is, you could hear only 50 percent of the acoustic signal (provided it's in your native language) and still understand everything that was intended. Under poor conditions, at cocktail parties or rock concerts, you may lose 50 percent and have to ask people to repeat. Under moderate conditions, you won't even notice a missing 20 or 30 percent.

Several things help if we hear "b-or-p, l, unidentified vowel, ck." One is acoustic space around words. I mean, the following are all possible English words: black, bleck, blick, block, bluck, plack, pleck, plick, plock, pluck, blag, bleg, blig, blog, blug, plag, pleg, plig, plog, plug. But out of those twenty very similar sounding words, only four are actually kosher English words (plus "blag," which is British criminal slang for robbery with violence), which means if we only partially hear a word, we have many fewer possibilities to sort through.

Whew! So let's assume that you heard Derek's ambiguous sound string, "b-or-p, l, unidentified vowel, ck" in the context of, say, "a big _______ dog," and you need to develop some candidates for what it might be. The first candidate you encounter may be so good, so "right" by pragmatic criteria, that no Darwinian competition is necessary. But let's assume it is harder, that you need a runoff once you develop a few candidates.

The received sound string will constitute a little tune, once encoded into ensemble firing patterns (which, of course, are quite abstract compared to the sound features, rather like hash codes). Imagine a whole hexagonal mosaic of X, what we'll call the sensory buffer, abutting a fallow field: a field full of resonances with common words, but with no cloned spatiotemporal patterning at the moment. The first problem is to produce some variant patterns, X', X'', X''', and so forth. That's most easily done with a series of barriers, each with a slit that temporarily reduces the number of possible neighbors capable of correcting an error in copying. The barrier is simply a string of hexagons with insufficient excitation to support recruitment by expansionistic triangular arrays. The slit is more than two, and less than three, hexagons wide (a bit more than a millimeter); the pair of hexagons in the slit are as excitable as those in the sensory buffer that cloned them. But as they go to clone a vacant hexagon in the fallow cortex to the right, they may make a mistake, perhaps omitting one of the hexagon's triangular arrays, perhaps committing the thick finger error of hitting upon the wrong minicolumn. Should this error happen in the first two hexagons filled, to the right of the slit, the modified pattern X' may take off, cloning its own territory and conforming all the surrounded hexagons with the new standard pattern X'. When X' passes through another slit, we get more errors that clone themselves, resulting in X'', and so forth. So you get variations on a theme, much as in Derek's example: black, bleck, blick, block, bluck, plack, pleck, plick, plock, pluck, plug.

But only four variations are likely to find resonances, the ones that have been heard so often in the past that they formed resonances in the synaptic strength patterns underlying the fallow field. And so, only black, block, plug, and pluck are likely to maintain hexagonal territories, once the sensory buffer stops driving the action in the formerly fallow field. We have found some candidates for the ambiguous sound string; now we have to make a decision via a Darwinian copying competition.
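The generate-then-filter step can be sketched with ordinary strings standing in for firing patterns. This is only an analogy: the set of "resonant" words is the four named in the text, and the split of each word into onset, vowel, and ending is an assumption made to reproduce Derek's list of twenty.

```python
from itertools import product

ONSETS = ("bl", "pl")   # "b-or-p" followed by "l"
VOWELS = "aeiou"        # the unidentified vowel
ENDINGS = ("ck", "g")   # endings used in Derek's list of twenty

# The four words the text says are likely to find resonances.
RESONANT_WORDS = {"black", "block", "pluck", "plug"}


def candidates():
    """All variations on the theme, familiar or not."""
    return [onset + vowel + ending
            for onset, vowel, ending in product(ONSETS, VOWELS, ENDINGS)]


def survivors(words, resonances=RESONANT_WORDS):
    """Keep only the variants that match a stored resonance."""
    return [word for word in words if word in resonances]


print(len(candidates()))        # 20
print(survivors(candidates()))  # ['black', 'block', 'pluck', 'plug']
```

In the cortical version the variants are not enumerated systematically like this; they arise from copying errors, and the filtering is done by which patterns can sustain a territory once the sensory buffer stops driving them.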

DB: Context will help here: if we hear "b-or-p, l, unidentified vowel, ck" in the context "a big ______ dog," then we know it's not likely to be "a pluck dog" or "a block dog," it's most probably "a black dog."

Context can eliminate a lot of possibilities, and our heads are full of second-order associations like "red robin," which also appear as resonances (perhaps elsewhere, rather than in the same formerly fallow field; I'll tell you in a minute how we can transfer this competition to a new playing field in some distant area of the brain). Pretty soon, the hexagons that code for "black" have cloned a lot more territory than the others, simply because there weren't many resonances for the combinations of "pluck dog" and "block dog." The blacks don't have to wipe out the competition, merely attain enough of a plurality to function in another competition, the one that will be concerned with the meaning of an entire noun phrase. All this presumably happens (we don't actually know yet) in the half-second or so that it takes to analyze and respond to such pattern recognition tasks. If everything had to be done by trying first one thing and then another, it would probably take minutes, but the brain likely has a lot of neural machinery operating in parallel on the problem.

DB: Another thing is all those articles and prepositions, those near meaningless bits of words that make up so surprisingly much of our speech; these all serve as signposts, so to speak, to the syntactic structure. The structure tells you what class a misheard word probably belongs to, and this in turn reduces the possibilities to very few or often just one.

Variants can also be creative, especially the ones for combination codes; by mixing up features of a horse and a rhinoceros, you can create never-seen creatures such as a unicorn. Indeed, it is the usefulness of the Darwinian process for divergent thinking that is its most intriguing application, with the promise of explaining much of our subconscious evolution of thoughts and how they might be shaped into collections of higher quality than the jumbled juxtapositions of our nighttime dreams. I assume that the sentences we eventually speak start out as low-quality collections, not making good sense, and even when they do, that they need a lot of improvement to put them in a form so that others might understand them. As speakers, we have to find the right word choices and arrangements that help the listener to quickly guess our who-did-what-to-whom mental model.

I'll address the issue of how we speak a sentence we've never spoken before, but after another short dose of cortical neurophysiology: tackling the long-distance issues, important for how we tie together the multimodality aspects of a concept.

The concept sites that we saw in the previous chapters could just be areas with the right resonances. They need not, in this two-level Hebbian view, actually be pure specialists; they need only be sites for the right long-term resonances, with the ephemeral spatiotemporal firing patterns there often representing something else, such as the X'' variants of the Darwinian workspace.

I said that resonances often exhibited "capture" effects: let any active pattern come close to the resonance, and it will be forced into the memorized resonance (just think of how the washboarded road forces you to slow down, making the jarring even more prominent). While there can be temporary spatiotemporal patterns that represent the unknowns of the sensory world, the resonances mean that some hexagonal firing patterns like A represent familiar features of a former environment, such as an apple. It's the widespread repetitions of the resonance that constitute the distributed nature of the memory; you can resurrect the active firing pattern A from any of a number of pairs of hexagons in the utilized regions of cerebral cortex.

Now the problem is: How do you communicate this code to a distant patch of cortex? And the answer to that will also show how on-the-fly trial associations can be done, such as that "block dog" and "black dog."
