copyright ©2000 by William H. Calvin and Derek Bickerton
The nonvirtual book is available from amazon.com or direct from MIT Press.
William H. Calvin
University of Washington
The Word Tree as a Secondary Use of Throwing's Segmented Movement Planner
While I suspect that categories can be stored almost anywhere, let me assume that the temporal lobe rearrangements are one physical manifestation of the great expansion of words. There would probably be a lot of social-calculus-style role links associated with the proper names stored in temporal tip, or with the facial expression recognition that is another known specialty of temporal lobe.
I'll also assume that the frontal lobe is the site where protolanguage's simple utterances are planned: premotor and prefrontal (that's the frontal lobe in front of premotor) are the areas most obviously involved in planning novel movements of all sorts, so perhaps they planned protolanguage utterances as well. Judging from its connections to midbrain for the rapid orienting responses, the frontal lobe might also be the home of those closed-class words that express relative location (above, below, in, on, at, by, next to) and relative direction (to, from, through, left, right, up, down).
Even if protolanguage didn't utter orienting words, hand movements might often be associated with them, as can be seen in modern Italy. Their use as markers of the boundary of a phrase might have followed, much as my pidgin Italian substitutes hand motions for a forgotten word. And because the frontal lobe tends to be involved with planning, perhaps the closed-class words for relative time (before, after, while, and the various indicators of tense) also live there.
So if you want a dichotomy, let me suppose for a moment (this is, of course, an oversimplification) that nouns and adjectives are mostly temporal lobe and that verbs and boundary words are mostly frontal lobe. Even a simple sentence requires their interaction, in this simplification. How does temporal lobe interact with frontal lobe, you ask, to intensify the style of sentence planning? Is there something about that interaction which might provide the Great Leap Forward, when our ancestors finally got their language act together?
Back in our early days at the Villa Serbelloni, Derek, you challenged me to come up with a neural mechanism for the step up from protolanguage to using syntax. I replied that I could actually think of a possibility, nicely illustrating how things might have happened. Here's a better version of what I was trying to say that day, as we sat outside on those park benches on the terrace, back in the humid heat before the postcard-clear autumn days arrived. And now I have the additional benefit of those ideas from our third week in Bellagio, when it became evident from playing bocce how the argument structure from the social calculus could have helped retrain the frontal lobe's segmented planner you use for "get set" in throwing, for use as a word tree.
Neurophysiologists do come at the language problem from a slightly different angle than most of the linguists do, who are often happy if they can just explain how a sentence might make sense. To me, part of the answer also has to involve the preparatory background to that sentence, the process you use to generate alternatives and make decisions between the better candidates. It's not enough to explain the structures for making sense of a completed long utterance; you also have to explain the "little person in the head" that generates the ideas and comprehends the input. Otherwise, you wind up with Dennett's fallacy of the stage on which scenes are played out for some viewer.
And that dualism isn't far from the point of view taken by the 1977 book that formed my first introduction to the Villa Serbelloni, The Self and Its Brain. In 1972, the philosopher Karl Popper and the neurophysiologist John C. Eccles participated in a conference here in Bellagio on reductionism in biology. They must have liked it because, in September 1974, the two of them returned to hold a series of conversations here, sitting out on this same terrace and listening to an earlier generation of roosters crowing, down the hill in Pescalo. Their talks featured some top-down attempts by Eccles (otherwise, very much of a bottom-up neurophysiologist whose Nobel Prize-winning spinal cord work in the 1950s formed the background for my more humble Ph.D. thesis on spinal cord neurons in the 1960s) to find the interface between his brain and his (very Catholic) immortal soul.
My neocortical Darwin Machine suggests one way that the little-person-inside problem could be avoided, as a Darwinian process provides explicit accounts for creativity, comprehension, subconscious thoughts, and for how attention might shift from a present topic to a new one, all without a central stage, with all the dynamics described in 1880 by William James:
But might the Darwin Machine also support (or even explain) syntax, that great aid to structured thoughts of length and subtlety?
The notion of competing solutions tends to orient you toward explanations that will allow an entire candidate sentence, complete with embedded phrases and clauses, to be compared against another complete candidate, with all its own phrases and clauses also in place. Yes, you may earlier have one phrase competing against an alternative phrase, but you also have to have a way of judging a complete sentence for its quality relative to another candidate sentence. You also need to judge the winner against internalized standards for solutions "good enough" so that you can finish, moving on to something else (otherwise, you send it back for further revisions).
But the production problem for structured sentences is far more difficult than the comprehension problem. How, I asked myself, could one possibly judge the whole sentence against alternatives and standards as one prepared to speak? That would involve having to unpack the winner into the motor program needed to pronounce the words in the right order. It had, I saw, to embody a two-level approach, something like the linguists' distinction between deep structure (whether argument- or phrase-based) and the surface structure conventions of the particular language.
DB: We should really clear up this whole business of deep structure and surface structure, because this terminology has passed into the public domain of interdisciplinary discourse, and everybody uses it, regardless of whether or not they understand it. If they don't understand it, they can now cop to that without losing face, because the distinction's mythical. We've come a long way, baby, but we've come round in a circle.
No, that's not quite fair: we've come round in a huge sweep just like the path here in Bellagio that circles the hill where the Villa stands. You suddenly find yourself directly above the place where you were ten minutes ago. You haven't made any linear progress but you're way up above where you were and the vista is immense; you can see where you are, how you got there and where you're going, which you couldn't before. So couldn't linguistics have short-cut it, you may ask, scrambling directly up the hill instead of wasting thirty years on this detour? Maybe, maybe not. At places here on the hill in Bellagio there are huge cliffs that you couldn't hope to climb. The only way to the top is round about and up. Maybe the last thirty years in linguistics were like that.
Don't take my word for all this. Take Chomsky's. Yes, in his latest (post-1990) minimalist model, Chomsky has finally removed the distinction, which he himself created, between deep and surface structure. There is now just one level of syntax, which is a "projection of the lexicon." What this means is that, in the dictionary of your own language that you carry around in your head, stored in the distributed patterns of neural resonances that Bill's told you about, there are stored, along with each word, all the features of that word. The features of a word (which form part of what we've been calling the role links) include its meaning, its number and gender (if it has any), the word-class or classes it belongs to, its function (if it's a grammatical morpheme), the thematic roles it assigns (if it's a verb), and so forth. Some of these features take the form of requirements: for instance, the article "the" requires a noun-phrase after it, the auxiliary verb takes only a present or past participle after it (you can say "is speaking" or "is spoken" but not "is speak" or "is spoke") and so on. What then happens is that you try to merge words to make larger units (phrases and clauses) by matching features. If the positive features of one match the requirements of the other, you can merge them, and move on to the next merger. If not, to use the latest jargon, "the derivation crashes"; that is, you get a spoonful of word salad.
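The feature-matching step can be made concrete with a toy sketch. The Python below is only an illustration under loud assumptions: the `Item` class, its `category`, `requires`, and `projects` fields, the category labels, and the `merge` function are all hypothetical names invented here; none of this is Chomsky's formalism. It shows one head word checking its requirement against a candidate partner and either merging or "crashing."

```python
from dataclasses import dataclass


class DerivationCrash(Exception):
    """Raised when a requirement is not met (the 'word salad' case)."""


@dataclass(frozen=True)
class Item:
    form: str                          # the word or phrase as spoken
    category: str                      # hypothetical word-class label
    requires: frozenset = frozenset()  # categories accepted next; empty = no requirement
    projects: str = ""                 # label given to the unit formed by merging


def merge(head: Item, complement: Item) -> Item:
    """Merge two items if the complement satisfies the head's requirement."""
    if head.requires and complement.category not in head.requires:
        raise DerivationCrash(f"'{head.form} {complement.form}' crashes")
    return Item(f"{head.form} {complement.form}", head.projects or head.category)


# A five-word toy lexicon (all labels are assumptions for illustration only).
THE = Item("the", "article", frozenset({"noun"}), "noun_phrase")
IS = Item("is", "auxiliary",
          frozenset({"present_participle", "past_participle"}), "verb_phrase")
SPEAKING = Item("speaking", "present_participle")
SPEAK = Item("speak", "bare_verb")
BOOK = Item("book", "noun")

print(merge(IS, SPEAKING).form)   # is speaking
print(merge(THE, BOOK).category)  # noun_phrase
```

Calling `merge(IS, SPEAK)` raises `DerivationCrash`, the toy counterpart of rejecting "is speak." Because a merged unit is itself an `Item`, it could in principle be merged again, which is the sense in which larger units are built step by step.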
The planning level needed, I thought, a way that the winners of the regional competitions for each component clause and phrase could, like the voices of a symphony or choral work, combine to produce a totality, one that could compete as a whole with other such wholes. The winning totality would then be decomposed in an appropriate order, rather like a reverse-order version of Benjamin Britten's Young Person's Guide to the Orchestra where each voice performs separately, then in combination. The readout for speech is a big problem as you must decompose your symphonic combination and do it in an understandable order: for example, Derek's expected ordering of arguments for the particular language you are currently speaking.
Throwing provides a nice example of two levels, namely the distinction between a plan and its unpacked execution. The current group of scholars and artists at the Villa Serbelloni, after a long day of work, tend to drift downhill to the bocce court on the waterfront for a preprandial game or two. The only problem is climbing back uphill afterwards, a little matter of 86 meters (I checked the topographic maps). It's like a tall building, with the dining room on the 25th floor and no elevator (but then another excellent dinner restores us).
Bocce is a European game played on a long, flat court; solid balls (about the size of a grapefruit or softball) are rolled toward a small target ball. The idea is to come the closest to the target ball or, if all else fails, to move the small ball or the competition's balls by hitting them with a pitched ball of your own. You've got two problems: throwing straight at the target, and not throwing too fast or too slow. Getting your ball to stop in the right place is the hard part of bocce because you must carefully regulate your hand's velocity up to the moment of launch. Beginners always overshoot.
It's difficult because rapid throwing, hammering, clubbing, and kicking are all ballistic movements; that is, once you get the motion underway, you soon pass a point of no return. Your arm can't be stopped; you can't even alter its trajectory. If your shirt sticks to your sweaty arm and tugs a little as you pick up a cup of tea, you have lots of time to correct your arm trajectory before the tea spills. But in the ballistic movements, you can't correct for the disturbance because there isn't enough time for the sensory signal to travel into the spinal cord, up to your brain, influence a decision, and then travel back down to the spinal cord and out to the arm muscles. That round trip takes about one-eighth of a second, and a typical dart throw is over and done by then. You have to make the perfect plan as you "get set" to throw, then put it into execution.
For an underarm throw, such as used in bocce, motions are generated in easily identified segments. The slowest is the forward motion of your upper body. Then there is the rotation of your upper arm around your shoulder, which moves the elbow forward. But, riding atop that, there is another motion being generated that rotates your lower arm about the elbow. Then there's the independent rotation around the wrist and, finally, the loosening of the fingers that let the ball fly free when it has reached the right velocity (not too fast, not too slow).
Because this isn't a standardized throw (as we try to make dart throws and basketball free throws), our "get set" task is to discover a multijoint solution that will launch the ball with the velocity that we judge appropriate to the target's distance (another difficult task, but I'll assume here that it is perfectly done). There are a number of launch combinations that will suffice (fast shoulder and slow elbow, minimal shoulder and a wrist flick, and so forth), but there are millions of wrong solutions to avoid. Still, Darwin Machines are very good at discarding the nonsensical and shaping up the quality of "4 out of 10" solutions into "score of 9" solutions.
But the multijoint aspect suggests that the Darwin Machine's planning task is structured, each joint's movement depending on the others. Were only body forwards motion to be used with a stiff shoulder and arm, the hand cradling the ball, then you could accelerate your body to the right velocity with your legs and then slow down, whereupon the ball would fly out of its cradle. The launch velocity is simply the hand's peak velocity.
With a stiff upper and lower arm and cradled ball, the launch velocity is a function of body velocity and the angular velocity of shoulder rotation and the distance from shoulder joint to the ball. The planner would only have to make the sum of body velocity and the velocity added by the shoulder rotation come out right.
Allow the elbow to rotate, and you must add body velocity to the sum of the shoulder angular velocity times the shoulder-to-elbow distance, plus the elbow rotation angular velocity times the elbow-to-ball distance. And so forth for wrist and fingers.
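The running sum described here can be written out as a short sketch. The Python below is illustrative only: the function name `launch_velocity`, the units, and the numbers are assumptions made for the example, and it simplifies to a single plane with every segment perpendicular to the direction of travel at release, each angular velocity being that segment's own rate of swing.

```python
def launch_velocity(body_velocity, segments):
    """Return the hand's velocity at the moment of release.

    segments: list of (angular_velocity, lever_length) pairs ordered from the
    shoulder outward; rad/s and metres are assumed units for illustration.
    """
    velocity = body_velocity
    for angular_velocity, lever_length in segments:
        # Each joint rides on the moving joint above it, so its
        # contribution is added on top of the velocity so far.
        velocity += angular_velocity * lever_length
    return velocity


# Stiff arm: only the shoulder rotates, with the whole arm as lever.
stiff_arm = launch_velocity(1.0, [(2.0, 0.65)])

# Elbow allowed to rotate: shoulder-to-elbow term plus elbow-to-ball term.
bent_elbow = launch_velocity(1.0, [(2.0, 0.30), (3.0, 0.35)])

print(stiff_arm, bent_elbow)
```

Adding a wrist or finger term is just one more pair in the list, which is why many different combinations can reach the same launch velocity.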
But, since each axis of rotation is itself moving forward (with some velocity that is a sum of the higher-up rotations' velocities) during its own angular rotational motion, the calculation is nested. One expects to see the brain using a structured algorithm looking something like a nested tree with successive merges, finally comparing the score of this solution to your memories of previous throws. The calculation actually isn't limited to five rotational axes. The shoulder is notoriously mobile, moving relative to the backbone with the activity in the muscles of the neck, back, and chest. The hand too has a few minor axes of rotation. They all complicate the equation, easily moving you off one of the few good solutions onto one of the millions of worthless solutions lurking nearby. Furthermore, you don't just need one of these planners but quite a number, so many solutions can be tried in parallel. Each would be rated on an arbitrary scale of goodness, and the better-rated solutions varied to create a new generation of candidates, exactly what Darwin Machines are so good at.
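The cycle of trying many candidate plans, rating them, and varying the better ones can be sketched as a generic evolutionary search. This is a minimal sketch, not a model of the neocortical mechanism: the function names, population size, mutation spread, lever lengths, and target velocity are all assumptions chosen for illustration, and the rating is simply the distance from a target launch velocity.

```python
import random


def launch_velocity(plan, lengths, body_velocity):
    """Body velocity plus each joint's angular velocity times its lever arm."""
    return body_velocity + sum(w * length for w, length in zip(plan, lengths))


def error(plan, lengths, body_velocity, target):
    """How far a plan's launch velocity is from the target (0 is perfect)."""
    return abs(launch_velocity(plan, lengths, body_velocity) - target)


def evolve(target, lengths, body_velocity=0.5, population=40,
           generations=30, seed=0):
    """Return (best_plan, history); history holds the best error found
    at the start of each generation."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    plans = [[rng.uniform(0.0, 10.0) for _ in lengths]
             for _ in range(population)]
    history = []
    for _ in range(generations):
        # Rate every candidate and put the best first.
        plans.sort(key=lambda p: error(p, lengths, body_velocity, target))
        history.append(error(plans[0], lengths, body_velocity, target))
        survivors = plans[: population // 4]  # keep the better quarter as is
        # Fill the rest of the population with varied copies of survivors.
        children = [
            [max(0.0, w + rng.gauss(0.0, 0.3)) for w in rng.choice(survivors)]
            for _ in range(population - len(survivors))
        ]
        plans = survivors + children
    best = min(plans, key=lambda p: error(p, lengths, body_velocity, target))
    return best, history


best_plan, history = evolve(target=6.0, lengths=[0.30, 0.35, 0.10])
print(history[0], history[-1])
```

Because the survivors are carried forward unchanged, the best error can never get worse from one generation to the next; the varied copies are what allow it to improve.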
The plan is abstract, not a little simulation operating in real time. The resonances against which the judgements are made are pretty abstract, too. But the implementation gets less abstract and closer to an orchestrated set of movement commands. It's a bit like Chomsky's older idea of a deep structure and a surface structure, or the common mathematical technique for analyzing communications engineering problems: working in the frequency domain and later converting the results back into the time domain.
The throwing implementation is a different spatiotemporal pattern than that of the code that competes. Instead of the space being the few hundred minicolumns comprising the hexagon, the space is now a long list of muscles, each contracted or relaxed at various times in order to rotate the joints at the requested angular velocities. Temporarily you can imagine each minicolumn as connected to a muscle (though I'm sure that the truth is far more complex).
And, just as I suggested a little tune as an analogy to the spatiotemporal pattern of both the concept's code and the planner's code, I can also suggest a different analogy for the performance's spatiotemporal pattern: just imagine a fireworks display, one where an aerial bomb bursts into a shower, parallel curtainlike lines tracing downward in the sky (each analogous to the changes in the activity of a given muscle in the back-chest-neck group). But in one of the curtainlike descenders, another bomb bursts toward the right, showering activity changes into another muscle group (those that rotate the upper arm forward). Then (all while activity is continuing to change in all the prior groups) an elbow-rotating muscle group bursts into life, then a wrist, and finally a hand-finger group bursts into activity on the right side of the muscle list. So charting things this way, "muscle space" is left-to-right, and time is downward, and everything is happening at once, though the new things tend to drift to the right and lower down in the time chart. (We are plotting changes in muscle activity here; there's a background of firing in all muscles, from when you tighten up everything as you get set to launch.) If the muscles are temporarily imagined as the keys on a piano, the music would be a densely packed arpeggio building into a precisely timed finale.
The most crucial events are those that happen late, during periods of high velocity movement; a little error there in timing, and the implementation mistakes are much greater than when everything was moving slowly earlier in the launch sequence. Fortunately, those crucial high velocity muscle changes don't have to be planned in real-time simulations; the planning machinery can look ahead in "virtual time," as well as backward at the candidate virtual launch so far.
Everything matters: the massive orchestration of all the relevant muscles must be judged as a whole. Does it hit one of the effective solutions (the ball is launched at the velocity that your depth perception judged to be correct for the bocce setting)? Or does it miss, merely because the activity of one muscle was altered at the wrong time?
It's much like how language's planner (all those little roles to be satisfied in argument structure) can look at how the utterance works as a whole and then turn the winning version into another one of those fireworks bursts of activity in a set of chest, tongue, and facial movements, each with its own set of physical constraints. The language planner can also look ahead to the completion of the utterance while adjusting the middle parts, just as the throwing planner presumably looks ahead to the crucial high-speed actions when adjusting the plan for the body and shoulder components.
Indeed, might the ballistic movement planner be able to do double duty, serving as the structured utterance planner when not too busy with throwing or hammering? Or might the planner machinery have been cloned (as visual field maps seem to have been)?
Study of the aphasics who have trouble with novel hand-arm movement sequences suggests that cortical space is shared between language and hand-arm. Ojemann's core areas where both auditory and oral-facial sequences are confused by cortical stimulation (mentioned back on page 65) also suggest shared function. The muscle groups used for the implementations of speech, throwing, and hammering may be different, but the segmented neural machinery of look-ahead-and-behind planning might be shared.
Throwing certainly has the equivalent of nested embedding: the launch sequence involving fingers is embedded in the environment created by wrist rotation, which is within the elbow's, the elbow's within the shoulder's, and all within the context of the slower body motion. (Derek, notice the "order of attachment" constraints here, perfect for the ordering of your obligatory roles!) So planning is naturally segmented in the manner of that earlier throwing planner diagram, which looks so similar to those diagrams of sentences like "I think I saw him leave to go home" with all their nesting.
Implementation in language has a series of constraints from conventions that have to be applied when converting the plan into the performance: word order or case markings, order of role attachments to the verb, and so forth. Throwing too has its local conventions, mostly from the length of your arms and the relative strength of your opposing muscles; converting a deep plan into the appropriate fireworks shower has to take all of these conventions into account. There aren't shared conventions for a whole community of throwers, but all throwers are constrained by the same Newtonian physics of launch and flight trajectories, learned in early life by calibrating body movements during play and by mimicking others.
If a Darwin Machine is used as the throwing planner, then there is another advantage: timing precision. That's extremely important during the higher velocity parts of the movements, the ones that cause the velocity to peak and fall off at the right time as the hand sweeps through an arc, and thereby launch the ball from the hand at the correct angle from the horizontal. (Speech has a similar set of high-velocity movements where timing is crucial; when timing is off, we say speech is slurred or foreign-sounding.) When the Darwinian throwing-planner competition produces a winner, it incorporates many losing hexagonal mosaics into the winning one, thus creating an even larger plainchant chorus, each hexagon of which is "singing" the winning "song."
The easy way to cut timing jitter in half is to increase the size of the chorus by a factor of four. (Doubling throwing distance while maintaining your success rate requires reducing jitter eight-fold, which takes a chorus 64 times larger.) A Darwin Machine might be able to shape up quality using a winning hexagonal mosaic of only a hundred hexagons but, if timing jitter matters, it may be advantageous to use much larger playing fields so that the winning mosaics are many times larger.
Ah, you may exclaim (as I once did): that must be the reason why hominid brain size increased fourfold in the last 2.5 million years. It's for throwing accuracy! Alas, a fourfold increase in mosaic size only buys you an insignificant 25 percent increase in throwing distance. (My general answer to why the brain enlarged is simply by analogy to economics: doing experiments is a lot easier in an expanding economy than in a zero-sum game, when you have to give something up in order to do the more-than-likely-to-fail experiment. And a hominid had to become a jack of all trades to survive the abrupt climate changes superimposed on the ice ages, which were too sudden for slow adaptations to help.) Size, about the only thing that can be measured about ancient brains, might have helped, but it can't be the main story.
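The arithmetic behind these chorus-size figures can be checked with a few lines. The sketch below rests on two assumptions stated plainly: that timing jitter falls as the square root of the number of independent timers averaged together, and that the jitter reduction needed grows as the cube of the distance factor (extrapolated from the single figure quoted in the text, eightfold for a doubling). The function names are invented for illustration.

```python
import math


def jitter_reduction(chorus_factor):
    """Factor by which jitter shrinks when the chorus grows by chorus_factor,
    assuming independent timers are averaged (square-root law)."""
    return math.sqrt(chorus_factor)


def chorus_factor_for_distance(distance_factor):
    """Chorus growth needed to throw distance_factor times as far at the
    same success rate, assuming required jitter reduction = distance ** 3."""
    required_reduction = distance_factor ** 3
    return required_reduction ** 2  # invert the square-root law


def distance_factor_for_chorus(chorus_factor):
    """Distance gain bought by a given chorus growth (inverse of the above)."""
    return chorus_factor ** (1.0 / 6.0)


print(jitter_reduction(4))               # a fourfold chorus halves the jitter
print(chorus_factor_for_distance(2))     # doubling the distance
print(distance_factor_for_chorus(4))     # a fourfold chorus, in distance terms
```

Under these assumptions a fourfold chorus yields a distance factor of about 1.26, in line with the roughly 25 percent figure, and doubling the distance calls for a chorus 64 times larger.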
The only practical way to get many-fold increases in hexagonal mosaics is to temporarily borrow cortical territory, much as the expert choir singing the Hallelujah Chorus borrows the inexpert audience.
As ever-larger hexagonal mosaics are created to reduce timing jitter in high-speed throwing, they may secondarily create coherent corticocortical connections. While one naturally thinks of large mosaics being created by expansion into contiguous territory, there is also (to be discussed in my next chapter) a way of borrowing distant cortex via coherent corticocortical connections (rather like singing along via a long-distance conference phone call).
The needs of throwing (where throwing twice as far or twice as fast is always significantly better for, literally, bringing home the bacon) may have driven the evolutionary changes in recruiting helpers, but other uses of the throwing planner might also benefit from them: language, planning for tomorrow, even music. And, if it's really a shared facility, improvements in anatomy for even better language performance could incidentally make throwing still better yet. (One isn't, of course, "throwing" words; if anything, one is throwing sentences.)
So don't make the mistake (as numerous people have done since 1981 when I first began discussing the role of accurate throwing in human evolution) of assuming that throwing is the sole driver of this common facility: any of the ballistic skills and higher intellectual functions could have driven its evolution. Some (say, throwing) might have been more important five million years ago, and others more important in the last several ice ages. But they all benefitted each other during the ascent of mind.
The throwing planner has a number of useful characteristics, when viewed from the standpoint of what language needs. It can help shape up quality utterances, because it's already a Darwin Machine. It can provide planning space for nested embedding of phrases and clauses, because of its nested treelike features. It can help achieve timing precision during high-speed vocalization sequences, again thanks to the Darwin Machine's final product, the large hexagonal mosaic. And this oversized chorus might, in turn, have seeded "metastases" in remote cortical areas via newly coherent corticocortical paths, cloning virgin territory.
But how might the structured Darwin Machine have interacted with the social calculus way of analyzing sentence structure, identifying arguments of the involved words? Very fruitfully, I suspect, because it can provide a segmented workspace that can house all of those phrases and clauses identified by argument structure. The progressive merging of throwing solutions seems a lot like the progressive merging of phrases and clauses that Derek emphasizes, where argument structure is mapped onto a tree. Training up a co-opted throwing planner into a sophisticated language planner using sharing-influenced argument tags might require a few additional refinements, but it looks like a good foundation, provided that the temporal lobe's nouns and their tags can readily participate in frontal lobe segmented planning.
DB: I think we have to be a little careful here in the analogy between executing a throw and building a sentence. The things that you regard as embedded in the throw (arm movements, wrist movements and the rest) differ from the phrases and clauses that are embedded in language in more than one way. First, an arm movement is not built up out of wrist movements, and a shoulder movement isn't built up out of arm movements: we're talking here about things which, while they share obvious commonalities, are simply different in kind from one another. However, every clause is a collection of phrases, one or more of which may be expanded into a clause, which in turn consists of a collection of phrases, one or more of which may be similarly expanded, and so on indefinitely. The same kinds of things are used over and over. Second, the number of units you use in a throw is finite and strictly limited (there are only so many body parts you can involve), whereas a sentence is potentially infinite, and certainly has no numerical limit.
WHC: Ah, but you are forgetting how arbitrary a hexagonal code is. It can be a movement, a word, a concept combination like "unicorn," a phrase, a clause, even a metaphor. It can represent nonsense, as in a mantra. (Ever hear the mantra that the Jewish Buddhists use to meditate? "Oy. Oy. Oy.") The neural machinery that links together modular movements doesn't know whether a code is ultimately a movement or a metaphor; it just structures what it is given in the way that a loom weaves yarn. A code for an analogy simply unpacks very differently than does a code for an entity or a state of affairs.
And a numerical limit might be a problem if the planning machinery is fixed in the manner of a railroad switching yard for boxcars (as I once diagramed ten years ago in The Cerebral Symphony, back before I knew the anatomical circuitry via which neocortex can implement a Darwinian process; that was Calvin 2.0, if we're going to keep track of versions). But rather than a fixed set of tracks and switches, I now think of it as more like a Lego or Erector set, one that allows you to build short squat bushes or tall spindly trees from the same set of building blocks. While there is perhaps some limit on the number of building blocks in the cerebral cortex, redundancy suggests that you could have tradeoffs, simply reducing redundancy (plainchant choir size) in order to have more independent branches (choral "parts," symphonic "voices").
The main thing suggesting limits is that "seven plus or minus two" human digit span, which is what makes it so hard to hang onto ten-digit phone numbers long enough to dial them, in comparison to seven-digit numbers. That limit might be the number of totally independent mosaics that can be managed simultaneously, without one dying out or several merging ("chunking").
There are also analogies to be made between trajectory planning and narrative projections, those blended spaces that Mark Turner talks about in The Literary Mind. Recall his example from the sailing magazine about a "race" between two boats whose journeys were actually 140 years apart:
We deal easily with such metaphorical constructions, mapping the old journey onto our trajectory planning for the modern trip to create a "ghost" lagging behind. Perhaps that's because we have a lot of mental machinery for representing real trajectories and matching them up with memorized ones of the past, checking for "fit." Understanding one story by mapping it onto a more familiar story (that's what constitutes a parable) shows how we can operate mentally, once we have the structure for syntax and can use it again for even more abstract, beyond-the-sentence constructions.
This promotes "logic." One of the problems, of course, is that mapping can change the input space if you're not careful, contaminating your model of reality. Just remember what Dostoevsky said in Letters from the Underworld: