William H. Calvin and Derek Bickerton, Lingua ex machina: Reconciling Darwin and Chomsky with the human brain (MIT Press, 2000), Bickerton's linguistics appendix. See also http://WilliamCalvin.com/LEM/LEMappendix.htm
Notes and References for the appendix
Webbed Reprint Collection
William H. Calvin
University of Washington
In this appendix I shall try to show that the core phenomena explained by a Chomskyan universal grammar can be derived directly from the exaptation of a social calculus, plus a theta-role hierarchy, the Baldwin effects of the exaptation, and a procedure for joining meaningful units. The stakes are quite high. If I fail in this attempt, then a substantial part of this book must be dead wrong. If I succeed, then the account of language evolution given in this book is strongly confirmed.
An enterprise of this nature is novel as well as risky. People who have tried to account for the evolution of language have paid little if any attention to the details of syntactic analysis; people who have studied syntax in any depth have paid little if any attention to the exigencies of evolution. It is time to abolish this dual imbalance. Syntax certainly evolved, and evolution, equally certainly, determined what the properties of syntax should be.
Recently, Noam Chomsky observed that "a problem for the biological sciences that is already far from trivial" is "how can a system such as human language arise in the mind/brain, or for that matter, in the organic world, in which one seems not to find systems with anything like the basic properties of human language?" Chomsky goes on to remark¹ that "biology and the brain sciences . . . as currently understood, do not provide any basis for what appear to be fairly well-established conclusions about language." The purpose of this book is, of course, to accept this challenge and try to provide just such a basis.
However, what has to be done may encounter initial resistance among some syntacticians who have rejected any attempt to determine syntactic theory that includes evidence from areas other than synchronic syntax.² Such resistance is natural, even laudable, considering the nature of some attempts made by specialists in other fields, with little or no understanding of generative theory, to explain how language came to be. However, one should bear in mind a very early remark by Chomsky³ that "theory is underdetermined by data" and that, consequently, constraints that go beyond the empirical ones may be required to decide between competing theories.
Such constraints are usually regarded as purely theory-internal, involving considerations of economy, elegance, consistency, and the like. However, I think that (with all due caution) a case can be made for using evolutionary considerations as a further constraint. Most of us agree that syntax evolved somehow, and it would be strange indeed if the process of its evolution had no bearing on its current state. Granted that the process described in preceding chapters is only one of a number of possible scenarios (even if a good deal more specific than most), it should be at least interesting to see the extent to which that process can be shown to determine the nature of syntax. But of course any such enterprise must submit to one essential restriction: what is proposed must take a form fully compatible with the "well-established conclusions" to which Chomsky refers in the passage cited above.
The Present Approach and the Minimalist Program
What is proposed here has some obvious affinities with Chomsky's Minimalist Program.⁴ However, one must point out a significant difference that is an inevitable consequence of the way in which the two approaches developed.
An ongoing problem for generative grammar has been its ongoing nature. The theory has evolved, over the past forty years, much as anything else evolves: that is, by throwing up, at each stage, a number of varying possibilities, and allowing the environment (in the form of a body of highly critical scholars) to determine which should live and which should die. The choice in turn determines the population of the next generation, reformulating and on occasion reprioritizing the problems assumed to be crucial to the overall enterprise. While there can be little doubt that this process, as a whole, has yielded ever fitter (that is, better fitting) theories, it does have a down side. At each stage of development, the theory contains elements inherited from previous stages that it doesn't really need (theoretical equivalents of the human appendix, so to speak) and that may be at variance with the spirit, if not the letter, of the current stage. The presence of these elements may in turn require adjustments that, in the long run, prove detrimental to the program as a whole. Some specific examples will appear later.
In the present approach, this problem is avoided by stating in advance the mechanisms that the theory will be allowed to incorporate, as well as certain mechanisms that will be disallowed on principle. For instance, while movement to both A- and A-bar positions is allowed, other forms of movement, such as those involving Larsonian shells⁵ and Pollockian expansions of IP structures,⁶ are barred. In the first place, both these strategies evolved as responses to problems involving serial order, and the solutions chosen were such as would preserve prior assumptions. Larsonian shells were motivated by the failure of c-command to account for a variety of structures,⁷ and were calculated to preserve the validity of c-command by generating structures that would exhibit the "proper" c-command relationships. Yet c-command is not an empirical fact but itself a theoretical proposal, albeit a venerable one;⁸ moreover, it cannot be deduced from any deeper principle, but must be stipulated. Clearly then, if it is possible to account for c-command phenomena by a mechanism that can be deduced, rather than stipulated, and that is independently motivated, the entire rationale for Larsonian shells disappears.
A second reason lies in the operation "Merge," which, though it has become central in minimalist thinking, was absent from its earlier stages.⁹ Once Merge, or something like it, is adopted, then the necessity for assuming the independent existence of tree structures into which lexical items are inserted simply disappears. There emerges instead the possibility of explaining puzzling phenomena on the basis of this attachment process, rather than by manipulating the prefabricated tree structures so as to move lexical items into configurations that never appear in surface structures. Chomsky himself appears well aware that this is the direction in which theory should move: as he remarked in a recent interview, "You will always do Merge if you can get away with it; you only do overt movement if there's no other way for the derivation to converge."¹⁰
A final reason for rejecting Larsonian and Pollockian solutions may carry less force with syntacticians, although I think it is equally legitimate. While there are good grounds for supposing that both A- and A-bar movements are neurologically real (in that both the extraction- and landing-site positions for such movements, as well as the movements themselves, must be tracked by the brain in the process of creating and comprehending sentences), the conceptual necessity of the positions and movements involved in Larsonian shells and Pollockian trees is much less well founded. While we cannot rule out the possibility that the brain does in fact go through all of the gyrations required by these as it creates and understands sentences, the assumption that it does not, that it does things in a more parsimonious manner, should surely be tested before resorting to more complex solutions.
The present framework, a sort of minimal minimalism, is extremely restrictive and limits itself to the following four mechanisms, plus procedures deductively derivable from these or resulting directly from their interaction:
A few remarks on each of these mechanisms may be helpful here.
A is argument structure, a mechanism that derives directly from the exaptation of the social calculus described in previous chapters. It results in the creation of what will be termed "Argument Domains." An argument domain resembles what has been described as a Complete Functional Complex;¹² it may be formally defined as follows:
It follows from (1) that argument domains may be divided into two classes, minimal and maximal domains:
Consider the following examples:
In (4a), the whole sentence is both the minimal and maximal domain of "John" and "Mary." The same applies in (4b, c), because in (4b) the subordinate clause is an obligatory argument of "tell" and in (4c) the most deeply embedded clause is an obligatory argument of "know," whose argument domain is in turn an obligatory argument of "tell." In (4b) the minimal domain of "Bill" is the subordinate clause and in (4c), the most deeply embedded clause, but in both cases its maximal domain is the full sentence, because each clause is an obligatory argument of the next highest clause. However, both (4d) and (4e) contain clauses that are not obligatory arguments of the matrix verb. Consider (5):
As (5b) shows, "know" can take a clause as one of its two obligatory arguments, but as (5a) shows, it can also take a noun-phrase, and as (5c) shows, it cannot take both. (4d) is analogous in structure to (5a): the fact that "man" has a complement clause attached to it has nothing to do with the argument requirements of "know." This must be the case even though it is assumed here that the Vergnaud/Kayne analysis of relative clauses is correct,¹³ in other words that "the man" originates to the right of "criticized" in (4d). Thus in (4d), both the minimal and maximal domains of "he" are the embedded (relative) clause, while the maximal and minimal domains of "John" are still the entire sentence. In (4e), however, the minimal and maximal domains of "John" are simply "John slept late", because the rest of the sentence consists of an adjunct (not an argument of "slept"). The minimal domain of "alarm-clock" is the nonfinite clause "to wind his alarm-clock", and its maximal domain "because he forgot to wind his alarm-clock."
The above domain categories have obvious implications for such things as movement and binding, which will be dealt with at greater length in subsequent sections. As a first approximation, constituents may not be moved outside of their maximal domains, and, if bound, are bound in their minimal domains.
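The way minimal and maximal domains are read off in the discussion of (4e) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Clause` class, the `obligatory` flag, and both function names are hypothetical, and the rule coded here (climb upward for as long as each clause is an obligatory argument of the next highest clause) is inferred from the prose above rather than taken from the formal definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    """A clause in a hypothetical clause tree (not part of the source framework)."""
    label: str
    parent: Optional["Clause"] = None
    # True if this clause is an obligatory argument of the verb of its parent clause.
    obligatory: bool = False


def minimal_domain(clause: Clause) -> Clause:
    # Assumption: the minimal domain of a constituent is the clause that directly contains it.
    return clause


def maximal_domain(clause: Clause) -> Clause:
    # Assumption: keep climbing while each clause is an obligatory argument of the next one up.
    current = clause
    while current.parent is not None and current.obligatory:
        current = current.parent
    return current


# The three clauses quoted in the discussion of (4e).
matrix = Clause("John slept late")
adjunct = Clause("because he forgot to wind his alarm-clock", parent=matrix, obligatory=False)
nonfinite = Clause("to wind his alarm-clock", parent=adjunct, obligatory=True)

print(minimal_domain(nonfinite).label)
print(maximal_domain(nonfinite).label)
```

Because the adjunct clause is not an obligatory argument of "slept," the climb stops there, which matches the maximal domain given for "alarm-clock" in the text.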
B is obligatory attachment, a mechanism that requires that every argument must be attached to a non-argument. A non-argument can be either of the two -N classes (Verb and Preposition) or any case-marker or other affix indicating, for example, topicality or thematic role (the present framework contains no principled distinction between bound and free morphemes). Consider (6):
Here there are two arguments ("America" and "Columbus") licensed respectively by the verb "discover" and a [+finite] INFL (in this case, the Past marking of "discover"). If we remove both licensing features in (6) (the verbal nature of "discover" and the finiteness of INFL) as for instance happens when we convert (6) into a noun phrase, the two arguments must be licensed by other means:
Now "Columbus" requires a genitive suffix to fulfill (B), and "America" requires a preposition. Note that these necessities are purely formal; there is no semantic impediment to understanding:
Now consider what is known as "Exceptional Case-Marking,"¹⁴ as in (9):
Here, INFL is [-finite] and cannot license the attachment of "Columbus." However, because the verb "expect" is not required here to license another argument, it licenses "Columbus" even though there is no verb-argument relationship between the two ("Columbus" remains an argument of "discover").
Note that because nothing stipulates that only one non-argument should be attached to each argument, there is no contradiction (in languages as diverse as Latin, Japanese, and Tagalog) in having all arguments licensed by their own affix, even those that (in English, Chinese or creole languages) would be licensed solely by verb or INFL. In other words, doubly or triply licensed arguments are not counterexamples to (B), because (B) states only a minimal condition.
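Condition (B) can be pictured as a simple check over attachments. The sketch below is a loose illustration, not part of the framework: the kind labels, the `unlicensed_arguments` helper, and the dictionaries are all hypothetical, and the set of non-argument kinds is a simplification of the classes named above (verbs, prepositions, finite INFL, and affixes such as case-markers).

```python
# Hypothetical labels for the kinds of non-argument that can license an argument.
NON_ARGUMENT_KINDS = {"verb", "preposition", "finite_infl", "affix"}


def unlicensed_arguments(attachments):
    """Return the arguments that violate (B).

    `attachments` maps each argument to a list of (licensor, kind) pairs.
    (B) is a minimal condition: one non-argument licensor is enough,
    and additional licensors are harmless.
    """
    return [
        argument
        for argument, licensors in attachments.items()
        if not any(kind in NON_ARGUMENT_KINDS for _, kind in licensors)
    ]


# A finite clause: the verb licenses "America", finite INFL licenses "Columbus".
finite_clause = {
    "Columbus": [("Past", "finite_infl")],
    "America": [("discover", "verb")],
}

# A noun phrase with no licensors at all for either argument.
bare_nominal = {"Columbus": [], "America": []}

# The same noun phrase with a genitive suffix and a preposition supplied.
repaired_nominal = {
    "Columbus": [("'s", "affix")],
    "America": [("of", "preposition")],
}

print(unlicensed_arguments(finite_clause))     # []
print(unlicensed_arguments(bare_nominal))      # ['Columbus', 'America']
print(unlicensed_arguments(repaired_nominal))  # []
```

An argument with two or three licensors passes the same check, which mirrors the point that multiply licensed arguments are not counterexamples to (B).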
Reference to creole languages brings up the point that these languages, which result from extremely unnatural forms of language contact where morphology is reduced to a minimum, still obey (B) without exception. Even where all appropriate grammatical morphemes have been lost, arguments not licensed by verb or INFL are still licensed by some non-argument; most commonly, lexical verbs are recruited to discharge non-argument functions, giving rise to the so-called "serial verb constructions" that are a common feature of creole languages. Thus in Sranan we find sentences like (10):
to render "I cut bread with a knife", or like (11):
to render "I gave the book to him" (in Sranan, an English-based creole, there are no reflexes of English "to" or "with"). The fact that Sranan and other creoles, emerging in a single generation from protolanguage-like pidgins in which constituents can appear randomly without any kind of licensing, immediately instal (B) in the absence of positive evidence is a clear indication that (B), although it may have originally arisen as an opportunistic strategy to assist parsing, now forms part of the human genome.
Thus (B) ties what has traditionally been referred to as "government" to the process of human evolution.
C, binary attachment, is perhaps the single most important mechanism in the present framework, because order of attachment turns out to be crucial in matters of binding, coreference, and scope. For some time, binary branching¹⁵ has been generally assumed as the most restrictive hypothesis; it forms, of course, the basis for the Merge process which is central to the Minimalist Program. The term "attachment" is used here because it seems to more accurately describe the process. Chomsky himself¹⁶ has pointed out that the product of "Merge" could potentially be labeled as the intersection of α and β, the union of α and β, or one or the other of α, β; he concludes that the third possibility is the correct one. However, the term "Merge" is more appropriate for either of the first two processes; in fact, either α is attached to β, and the product is β, or β is attached to α, and the product is α (that is, modifiers are attached to heads, rather than vice versa).
Another difference between Attach and Merge lies in the actual node labeling. In the source just cited, Chomsky labels the product of a merged noun ("book") and determiner ("the") as "the" (equivalent to DP in other frameworks) rather than "book" (equivalent to NP). Indeed, since the work of Abney,¹⁷ the DP assumption has been fairly standard. However, the main motivation for Abney's proposal was identical to that of Larsonian shells and Pollockian trees: that is, to create additional spaces for constituents to be moved to. This in turn reflects the long-standing and seldom questioned assumption in generative grammar discussed above: that there exist abstract tree structures and that lexical items are attached to terminal nodes in those structures.
This assumption, although not yet banished by the Minimalist Program, is now unnecessary. Trees are simply built from the bottom up by successive binary attachment, and because there is no X-bar theory,¹⁸ there are no constraints on the number or type of attachments beyond those imposed by the features in the lexical entries of the items to be attached. Movement is simply reattachment of an item at a higher level (see below for details of, and constraints on, this process); no prearranged "landing site" is required. Thus the immediate motivations for Abney's proposal no longer exist. For instance, the fact that in some languages possessive NPs and determiners can co-occur no longer requires that space be made for them. The solution lies in the lexicon. Lexical items may be specified with respect to whether or not they are phrase-final attachments, that is, whether attachment of such an item prevents further attachments to its phrase. In English, determiners are so specified, but not in all languages. Another of Abney's problems, gerundive NPs, likewise disappears once requirements of X-bar theory no longer need to be met.
However, Merge and Attach are similar in that both require feature matching or checking in some form or other. In attachment, the requirements of targets (heads or projections of heads) and the feature specifications of the units to be attached to targets must correspond (other constraints on the process are supplied by (D), the hierarchy of thematic roles, to be dealt with in the following section). Consider, for example, the attachment of final arguments (equivalent to "external arguments" or "subjects" in other frameworks; see next section for a detailed treatment of prior and final attachments) to nonfinite domains. This attachment is subject to the condition that the immediately prior attachment be one of Tense. If Tense is not present, the requirement of (B) for the final argument to attach to a nonargument cannot be satisfied. It can, however, be satisfied if the next attachment is to a non-argument that has not already been (and will not be, in any subsequent attachment) needed to fulfil another argument's (B)-requirement. Consider the following examples:
In (12a) "wanted," a nonargument which has not saturated its capacity to fulfil (B), is attached to the node immediately dominating the previous attachment "John"; "John" requires a nonargument, "wanted" supplies it, and the attachment is licit. In (12b), on the other hand, "desperately" must attach to "John to leave," and "desperately," an adverb, does not fall within the definition of non-argument given above; accordingly, attachment cannot proceed and the derivation fails. Again, in (12c), a permissible non-argument, the preposition "for", attaches to the node immediately dominating "John", licensing attachment. However, if as in (12d) another argument, "Bill", intervenes between a potential nonargument ("persuaded") and "John", the nonargument requirement of "Bill" saturates "persuaded" and an overt final attachment to the nonfinite domain is barred.
Argument Structure and Syntax
Let us turn now to D, the question of how thematic roles and the arguments that carry them get mapped onto a hierarchical structure, an issue that must be central to any theory seeking to demonstrate the evolution of syntax from argument structure. The process of attachment is crucially involved here; here, and indeed in all that follows, a guiding strategy will be to derive as much of the theory as possible from this process. Attachment, of its nature, is a serial, cumulative process, and the moves that compose it must follow a definite order. Here and elsewhere, the idea that order of attachment might play a significant role will be developed; in particular, it will be convenient to refer to attachment in terms of priority and finality, with respect to argument domains in general and minimal domains in particular. Thus attachment of X will be prior to attachment of Y iff X is attached to the tree before Y is attached. The attachment of X will be final iff X is the last argument of the verb of a given domain to be attached to that domain.
For reasons of space and simplicity, detailed discussion will be limited to English-specific mapping. We start with verbs that have only one obligatory argument. In English (and perhaps generally) the final attachment must be an obligatory argument. Accordingly, that argument (whether theme, experiencer, or some other category is immaterial) must attach to an otherwise completed domain (in the case of English, to the left of that domain). If there are two arguments, one is often agent; in the unmarked case, agent forms the final attachment and theme is attached directly to the verb, that is, prior to all other argument attachments.¹⁹ If one argument is experiencer and the other is theme,²⁰ the ordering depends on whether the verb is lexically marked as [+causative]:
(13b) can be paraphrased as "Ghosts cause Bill to be afraid"; the theme argument thus acquires causative (agentive) properties and is treated syntactically as if it indeed had the role of agent. In (13a), however, the verb will not take a causative meaning, so theme reverts to its normal (right of the verb) attachment.
Where there is a third (usually goal) argument, this attaches directly to the right of the verb, prior to the theme argument, which now attaches directly to the right of the verb-goal complex. Note that if this order is reversed, the goal argument requires a preposition:
This leads us directly to the question of how alternatives to the unmarked ordering discussed above are handled. Consider passives:
(16c) is usually explained by the inability of a participle such as "given" to assign Case (thereby forcing movement of "a book") although in (16a) and (17):
no problem arises. But this does not account for the contrast between (i) and (ii) in footnote 17, repeated here for convenience:
Either we must assume two lexical entries for the verb "worry," with different subcategorization frames, or a more general process is at work.
The distribution of arguments noted above would follow automatically if there were a hierarchy of thematic roles in which positions in the hierarchy corresponded to positions in the ranking of prior and final attachments. In the latter, as we shall see, final attachments take precedence over all prior attachments, while prior attachments take precedence over subsequent attachments (excluding final attachments, naturally). This would suggest a thematic hierarchy of agent/Causer > goal > theme/experiencer,²² yielding agent/Causer as a final attachment, and goal as a first attachment prior to theme. (There appear to be no cases of verbs taking agent, goal, and experiencer as obligatory arguments.) Any variation in the positions determined by the hierarchy would then be signaled by movement of a lower argument to a higher position (lower in terms of priority/finality, that is) and the demotion of higher elements to prepositional phrases.²³ In other words, if goal is promoted to final attachment (as in [16a]), theme moves into its vacated position while agent must be demoted to a PP, while if theme is promoted (as in [16b]), both agent and goal must be demoted to PPs.
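The unmarked mapping from the thematic hierarchy to order of attachment can be sketched as a small ranking function. This is a rough illustration only: the `RANK` table and the `unmarked_order` function are hypothetical names, the [+causative] case is modeled by simply relabeling theme as causer, and the tie-break between theme and experiencer (theme stays as the prior attachment next to the verb) is an assumption read off the discussion of (13a), not a stated rule.

```python
# Hypothetical numeric ranks for agent/Causer > goal > theme/experiencer (0 is highest).
RANK = {"agent": 0, "causer": 0, "goal": 1, "theme": 2, "experiencer": 2}


def unmarked_order(roles, causative=False):
    """Return (prior_attachments_in_order, final_attachment) for a verb's obligatory roles."""
    roles = list(roles)
    if causative and "theme" in roles:
        # Assumption: with a [+causative] verb the theme is treated as a causer.
        roles[roles.index("theme")] = "causer"
    # The highest-ranked role is the final attachment; on a tie,
    # a non-theme role wins, so that theme remains adjacent to the verb.
    final = min(roles, key=lambda role: (RANK[role], role == "theme"))
    # Remaining roles attach in rank order, so goal precedes theme.
    prior = sorted((role for role in roles if role != final), key=lambda role: RANK[role])
    return prior, final


print(unmarked_order(["agent", "theme"]))           # (['theme'], 'agent')
print(unmarked_order(["agent", "goal", "theme"]))   # (['goal', 'theme'], 'agent')
print(unmarked_order(["experiencer", "theme"]))     # (['theme'], 'experiencer')
print(unmarked_order(["experiencer", "theme"], causative=True))  # (['experiencer'], 'causer')
```

The sketch covers only the unmarked orderings; promotion of a lower argument and the accompanying demotion of higher ones to prepositional phrases are not modeled.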
Traditionally, movements such as those described above (A-movements) have been regarded as leaving gaps (empty categories, or ECs), as well as coindexed traces at the extraction site. There was always, however, something of a double standard involved. Objects moved and left gaps and traces (even though their position might still be case-marked and governed, as [16a] suggests) but subjects moved without leaving either a gap or a trace. Moreover, no one suggested that, in pairs like (18a,b), the object of (18a) moved to the subject position in (18b), leaving a gap ("John worries EC about politics"). Because A-movement involves for the most part²⁴ a reshuffling of positions, with predictable consequences, it will be assumed here that (in contrast with A-bar movement) it leaves neither a gap nor a trace.
So far, optional arguments have not been mentioned. These, in English, are attached to the right of obligatory arguments, in no particular order (as [16b] shows, there are no ordering restrictions even on obligatory arguments once these have been demoted to prepositional phrases). The position of optional arguments in the order of attachment between what have been traditionally described as "internal arguments" and the "external" one is logical in terms of priority and finality; both internal arguments are prior to all optional arguments and (by finality) the external argument is also superior to them. (We shall see, in subsequent sections, how final and prior attachments command and control nonfinal, nonprior ones.)
Arguments, whether optional or obligatory, can of course themselves constitute argument domains without limit, in any position. If an argument is complex (a complex phrase or another domain) it is assembled completely, along the lines described, before being attached to the main structure.
With regard to linear ordering, this is assumed to follow directly from the hierarchical structuring of constituents. Whereas Merge is purely hierarchical and requires a subsequent ordering in the phonological component (as some treatments suppose),²⁵ Attach is a concrete operation that specifies, for each attachment, the direction of attachment. It is therefore possible to take any syntactic tree and read off its terminal nodes, starting from the highest left-hand node and finishing with the furthest right-hand node; this gives the appropriate linear order.
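The idea that each attachment fixes its own direction, so that linear order can be read straight off the tree, can be sketched as follows. The representation is an assumption made for illustration: trees are nested Python tuples, and `attach` and `linearize` are hypothetical helper names rather than operations defined in the framework.

```python
def attach(target, item, side):
    """Binary attachment: join `item` to `target` on the given side.

    A tree is either a string (a terminal node) or a pair (left, right).
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    return (item, target) if side == "left" else (target, item)


def linearize(node):
    """Read off terminal nodes from the leftmost to the rightmost."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return linearize(left) + linearize(right)


# A complex argument is assembled completely before it is attached.
theme = attach("book", "a", "left")

# Goal attaches directly to the right of the verb, theme to the right of
# the verb-goal complex, and the final (agent) attachment to the left.
domain = attach("gave", "Mary", "right")
domain = attach(domain, theme, "right")
domain = attach(domain, "John", "left")

print(linearize(domain))  # ['John', 'gave', 'Mary', 'a', 'book']
```

No separate ordering step is needed in this sketch: the order falls out of the direction recorded at each attachment.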
The model presented here assumes, in accordance with a long-standing generative tradition that still continues,²⁶ a model of movement consisting of three operations: (a) the insertion of the to-be-moved constituent in its "expected" position (the position dictated by the mapping procedures described in the previous section); (b) the copying of the constituent at the site to which it is to be moved; (c) the deletion of the original insertion.
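The three operations can be traced step by step in a toy sketch. Everything about the representation is an assumption made for illustration: a domain is flattened to a list of constituents, the copy is placed at the left margin, and the deleted original is marked with the string "EC"; `move_to_left_margin` is a hypothetical name.

```python
def move_to_left_margin(domain, index):
    """Return the three stages of moving domain[index] to the left margin.

    (a) insertion in the expected position, (b) copying at the target site,
    (c) deletion of the original insertion, leaving an empty category.
    """
    inserted = list(domain)                    # (a) constituent sits in its expected position
    copied = [inserted[index]] + inserted      # (b) a copy is attached at the left margin
    deleted = list(copied)
    deleted[index + 1] = "EC"                  # (c) the original is deleted (offset by the copy)
    return inserted, copied, deleted


inserted, copied, deleted = move_to_left_margin(["Mary", "saw", "the boy"], 2)
print(inserted)  # ['Mary', 'saw', 'the boy']
print(copied)    # ['the boy', 'Mary', 'saw', 'the boy']
print(deleted)   # ['the boy', 'Mary', 'saw', 'EC']
```

At the intermediate stage there are simply two identical copies of the constituent; only after deletion does the moved copy stand alone with an empty category at the extraction site.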
Parsing considerations suggest that in any viable language, constraints would have to be imposed upon movement, otherwise anything could turn up anywhere, and the search for antecedents would become too costly in terms of time and energy to permit the kind of rapid, automatic processing on which language relies. In general, movement causes constituents to appear as postfinal (in English, extreme left-hand) attachments to argument domains; that is to say, once a final argument has been attached, an argument already attached within the domain in question may be copied and attached in postfinal position. However, attachment for copies of nonfinal WH arguments cannot be made directly to the attachment node of the final argument,²⁷ but requires the presence of a tensed auxiliary verb; attachment for copies of final arguments (as in , below) lacks this requirement, because deletion of the original constituent allows the copy to attach to nonargument INFL.
As indicated by (19)-(21), movement can be to the left margin of a minimal domain (19-20) or a maximal domain (21). In (22), it might be objected that "the boy" and "the boy Mary saw" cannot be copies, since their reference is distinct. However, their reference is not distinct at the time the constituent is copied. The string (23)
exists before attachment to "I know" (that is, before the stage represented by [22b]) and at this stage there are simply two copies of "the boy"; the argument "the boy Mary saw" does not exist until after attachment and after the deletion of the copied constituent. At the stage represented by (23), the string could just as easily become a complete sentence:
As has been known since the 1960s, there are a number of restrictions on the extent of movement. For example, movement cannot occur out of the following constituents, even when the sentences would be fully grammatical without movement (in each case, movement is supposed to have originated at EC):
Finite adjunct clauses:
Movement to the left margin of the matrix domain is impossible in all these cases because in none of them is the full sentence a maximal domain. In each, the minimal domain of the EC is its maximal domain. In (25), "someone that knew EC/Bill" is an argument, but not a minimal domain; the minimal domain is "EC that knew EC/Bill," where the first EC is the extraction site for "someone." Similarly in (26a), "the rumor that Mary liked EC" is an argument, but not a minimal domain; the minimal domain is "Mary liked EC." In (27), there are clearly two minimal domains linked by a conjunction, and in (28) the minimal domain of the EC is not an argument of "worried." In other words, movement is only possible within maximal domains, and, as (29) shows, not always then.
Here, the embedded sentence is an argument of the matrix verb but its postfinal attachment position is already filled by "who." Postfinal attachment of an argument copy (or anything else) to a minimal domain "closes" that domain, making it inaccessible to further extractions.
However, sentences such as the following, which seem to allow exit from a closed domain, are frequently cited in the literature:²⁸
Examples like this are supposed to demonstrate that a complement can move across an adjunct but an adjunct can't move across a complement (the second EC in (30a) represents the extraction site of "what," while the third is supposed to represent the extraction site of "how"; the first EC is of course PRO, coindexed with "you"). However, there is good reason to suppose that (30) has been misanalyzed:
(31a) shows that with "how" in situ (i.e. no crossing at all) the sentence is worse than (30a). The contrast between the two "surprise" questions (31b,c) shows that the reason uniting all three ungrammatical cases is that "wonder" requires a WH-word (or "if") as complementizer. If "how" is a complementizer, it is not an argument extracted from its minimal domain, and therefore not an obstacle to the movement of "what" to the beginning of the sentence.
A further environment where movement is barred is that of sentential subjects. Although it has long been a goal of generative grammar to subsume all barriers to movement under a single mechanism, it remains an empirical question whether or not this is possible or even desirable. Its desirability may look obvious, but the relative parsimony of grammars should be judged not on whether they provide a single explanation for what has been traditionally regarded as a single phenomenon, but on which grammar requires the fewest principles and the least stipulation. In (32), the extraction site clearly lies within an argument of the matrix verb:
However, accounts of movement do not often note that
are just as bad as (32), even though (35) is fine:
Thus (32) does not represent an asymmetry between extraction from subject and extraction from complement, but rather a condition on attachment. The lexical specification of interrogative "do" must include the proviso that it cannot attach unless it can access, by (D), a terminal node representing an argument attachment. However, in (32)-(34), the only terminal node accessible to "do" is the one where "that," a nonargument, is attached. Note that in (35), (D) allows the auxiliary to access the node where "John," an argument of the domain "John's reading (of) Homer," is attached.
Another environment in which movement is apparently blocked involves the so-called "that"-trace effect:
An enormous literature has been written about this effect, and many proposals have been advanced to account for it, most of them trying in some way to subsume the effect under some independently existing barrier to movement. However, a number of languages fail to show the effect. Indeed, sentences may be ungrammatical without the complementizer:
Several writers have drawn attention to these problems²⁹ and attributed them to differences in the properties of the relevant morphemes in the different languages. In terms of the present framework, some factive complementizers require that the attachment immediately prior to their own have phonetic content, while others do not; it is worth noting here that Spanish "que," in contrast to "that," will also introduce nonfinites with null subjects:
It would seem therefore that barriers to movement arise from two distinct causes, attachment conditions and membership of minimal domains that either do not form arguments of maximal domains or have been closed by postfinal attachment. Though the conditions are disjoint, they are certainly no more complex than any other proposal for constraining movement. They have the further advantage that they invoke no more than the basic conditions of attachment and domain formation that underlie so many other aspects of the present account.
Determining Reference of Empty Categories
Movement has, as an incidental consequence, the creation of a large number of empty categories. At the same time, certain A-positions may be forced to remain unfilled because there is no available non-argument they can attach to. However, all empty categories have to receive reference from somewhere (with the exception of "generic" complements such as those in  and cases of indefinite reference such as that in ):
Because empty categories are featureless and lack independent reference, and because part of the evidence for assigning reference to nonreferential constituents consists of matching their features with those of potential antecedents (we know, for instance, that "he" requires a masculine, singular, human antecedent), how do hearers determine the reference of empty categories?
In fact, there is a simple procedure, based largely on the order-of-attachment features "prior" and "final," which enables hearers to automatically determine the reference of empty categories, even where two or more empty categories occur within the same sentence. This procedure begins by determining whether the sentence contains A-bar constituents. If there is more than one such constituent, it takes the most deeply embedded constituent and determines whether there is a nonfinally attached EC. If there is, the A-bar constituent identifies with it; if not, it seeks an EC finally attached to a tensed clause. If there is one, it identifies with that; if not, it seeks an EC that lacks a prior attachment outside its minimal domain and identifies with that. The process is repeated (if necessary) until all A-bar constituents have been identified. Any ECs that have not yet been identified are then identified with an immediately prior attachment outside their minimal domain, if there is such an attachment, otherwise with the first final attachment outside their minimal domain. Any remaining ECs should receive a generic or indefinite interpretation.
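The procedure just described can be sketched as a short program. The Python sketch below is an illustration under stated assumptions, not part of the analysis itself: the names EC, ABar, resolve, final, tensed, prior_outside, first_final_outside, and scope are all hypothetical, and the attachment features of each empty category are assumed to be supplied by hand rather than computed from a parse. The sample feature values follow the description of (42) given in the discussion of the examples.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class EC:
    """An empty category with hand-coded attachment features (hypothetical model)."""
    name: str
    final: bool                                # final attachment in its minimal domain?
    tensed: bool                               # is that minimal domain a tensed clause?
    prior_outside: Optional[str] = None        # immediately prior attachment outside its minimal domain
    first_final_outside: Optional[str] = None  # first final attachment outside its minimal domain


@dataclass
class ABar:
    """A constituent in an A-bar position."""
    name: str
    depth: int       # depth of embedding; deeper constituents are handled first
    scope: Set[str]  # names of ECs it may identify with (assumed to be given)


def resolve(abars: List[ABar], ecs: List[EC]) -> Dict[str, str]:
    """Map each EC name to the label of the item it takes its reference from."""
    reference: Dict[str, str] = {}

    def pick(candidates, test):
        # Return the first candidate EC satisfying the test, or None.
        for ec in candidates:
            if test(ec):
                return ec
        return None

    # Step 1: A-bar constituents, most deeply embedded first.
    for abar in sorted(abars, key=lambda a: a.depth, reverse=True):
        free = [ec for ec in ecs if ec.name in abar.scope and ec.name not in reference]
        target = (
            pick(free, lambda ec: not ec.final)                 # a nonfinally attached EC
            or pick(free, lambda ec: ec.final and ec.tensed)    # else final in a tensed clause
            or pick(free, lambda ec: ec.prior_outside is None)  # else lacking a prior attachment
        )
        if target is not None:
            reference[target.name] = abar.name

    # Step 2: remaining ECs take the immediately prior attachment outside their
    # minimal domain, else the first final attachment outside it, else a
    # generic or indefinite interpretation.
    for ec in ecs:
        if ec.name not in reference:
            reference[ec.name] = (
                ec.prior_outside or ec.first_final_outside or "generic/indefinite"
            )
    return reference


# Feature values as described for (42): two final ECs in untensed clauses,
# each with a prior attachment outside its domain, and one nonfinal EC.
who = ABar("who", depth=0, scope={"EC1", "EC2", "EC3"})
ecs_42 = [
    EC("EC1", final=True, tensed=False, prior_outside="Sally"),
    EC("EC2", final=True, tensed=False, prior_outside="Mary"),
    EC("EC3", final=False, tensed=False),
]
result_42 = resolve([who], ecs_42)
print(result_42)
```

On these inputs the sketch pairs "who" with the third EC, "Mary" with the second, and "Sally" with the first. Passing the features in by hand keeps the sketch independent of any particular parser; deriving them from attachment order is left open here.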
The process can be illustrated by the following examples:
In (42) the only A-bar constituent is "who." The first two ECs are final attachments to untensed clauses, but the third is a nonfinal attachment; therefore "who" is identified with it. The other two ECs both have immediately prior attachments outside their minimal domains: because "EC to call EC" is attached to "tell Mary" after the attachment of "Mary" to "tell," and because "EC to tell Mary EC to call EC" is attached to "ask Sally" immediately after the attachment of "Sally" to "ask," "Mary" and "Sally" respectively are the immediate prior attachments (and accordingly, the antecedents) of the second and first ECs.
Again, "who" is the only A-bar constituent. Here there is no nonfinal EC, and of the two ECs, only the first is in a tensed domain, so "who" is identified with the first EC. The second EC has no immediately prior attachment, because "EC to see Bill" is attached directly to "wanted." The first final attachment outside the second EC's minimal domain is the first EC (a nearer antecedent than "you," the only other possibility), so "who" and both ECs co-refer.
Here there are three constituents in A-bar positions, "who," "the boy," and "someone" ("The boy Bill saw EC" is, of course, in an A-position with respect to the matrix clause, but with respect to the minimal domain "Bill saw EC," "the boy" is in an A-bar position; similar considerations apply to "someone").
We begin with "someone," the most deeply embedded, and find within its minimal domain a nonfinal EC (the fifth and last) with which it identifies. Similarly, "the boy" identifies with the first EC, also nonfinal. This leaves three ECs, of which one, that which follows "persuade," is nonfinal. Accordingly, "who" identifies with it. The same EC is attached to "persuade" prior to the attachment of "EC to find someone," so the two ECs share reference with "who." Finally consider the fourth EC, "subject" of "work for."
"Someone," the nearest potential antecedent, cannot corefer with it, because "someone" is a postfinal, rather than a final or prior, attachment to the EC's minimal domain. Accordingly, the fourth EC identifies with the third EC, the first final attachment outside its domain.
On the surface, (45a, b) differ by only a single word; however, the pattern of identification of the ECs differs sharply. In (45a), "who" again identifies with the third, nonfinal EC. In (45b), however, "stay," in contrast with "see," has only one argument, so that there is no nonfinal EC (and, for that matter, no final EC in a tensed domain). "Who" must therefore identify with one of the untensed final ECs; because the second EC has a prior attachment, "Mary," to identify with (in both sentences), "who" in (45b) identifies with the first EC. However, the first EC in (45a) identifies with "you," the first final attachment outside its domain.
Binding, Scope, and C-command
Treatment of the areas covered so far has been unavoidably brief, omitting many nontrivial issues and leaving others without adequate discussion. In a single appendix to a single volume, it is obviously impossible to do justice to the wealth of both empirical data and theoretical argumentation that have emerged during the last four decades in the study of syntax. However, in order to show what the present framework is capable of, it seems desirable to take a single area and explore it in somewhat greater depth.
The area involving binding and scope is one that has been central to syntactic theory over at least the last two decades. Around binding theory in particular, a vast and controversial literature has accumulated, in part because of extensive cross-linguistic variation.30 A variety of solutions have been proposed, and most treatments have found themselves obliged, at one stage or another, to handle phenomena within the same general area in rather different ways. However, the present framework is able to handle the entire area in a unitary manner; moreover, the mechanisms required to handle it are not (unlike other approaches) crafted specifically to fit the area involved. All that is required to handle problems of scope and binding is contained in notions that already form an inescapable part of the theory: those same notions of priority and finality of attachment that we have already seen at work in other areas of the grammar.
In its classic form,31 binding theory describes the conditions that identify constituents without independent reference, that is, pronouns and anaphors (reflexives, reciprocals, and the like), by assigning mutually exclusive domains to these two classes.
But not all languages have both pronouns and anaphors: for instance, in at least one dialect of Haitian,32 there is only one non-independently referential item in the third-person singular:
Languages with just two classes, pronoun and anaphor, tend to adopt an English-type distribution, with anaphors referring within their minimal domain, pronouns outside it. But there exist languages such as Greek or Icelandic with more than two classes. For instance, Greek has an item "o idhios," literally "the same," which occurs within the maximal domain of its antecedent but, unlike the reflexive form "ton eafton tou," does not have that antecedent within its minimal domain:33
Note that a literal, idiomatic translation of (48) is impossible because English lacks a comparable feature.
In other words, languages keep the domains of dependent items as separate as possible, and they do so by exploiting the categories (argument domain, minimal domain, maximal domain) created by argument structure. But precisely how each language exploits those categories depends on interactions with other factors, especially the number of dependent referential items the language happens to have.
However, perhaps the most critical problem in binding theory lies in determining not so much the domains within which binding is (im)possible as the circumstances under which constituents in any domain can be bound. A relationship between binding and c-command (varyingly defined, but the differences between definitions will not concern us here) has been assumed almost without question for more than two decades. However, in 1986 a squib by Barss and Lasnik34 showed that, in double-object sentences such as (49a):
none of the structures hitherto proposed gave the right configuration for "Bill" to c-command "himself"; indeed, in the most popular and natural of these structures, "himself" c-commanded "Bill." A number of solutions for this rather serious problem have been proposed, among them Larsonian shells5 and Pesetsky's "cascades,"35 all of which have been designed to provide the "right" configuration for c-command to operate.
An attachment framework is free to take a quite different approach. In fact, what has been called "c-command" falls out from finality of attachment. Consider the following:
The pattern of grammaticality in these sentences is normally attributed to the fact that the anaphor is c-commanded by "Bill" (50a) or "Bill's sister" (50b), whereas in (50c), the first branching node that dominates "Bill's" does not dominate "himself." However, it is equally true that "Bill" and "Bill's sister" are final attachments in (50a) and (50b), and accordingly bind anaphors in the minimal domains to which they attach. In (50c), on the other hand, "Bill's" is not a final attachment ("Bill's" must be attached to "sister" before "Bill's sister" can be attached to "was pleased . . ."), so "Bill's" cannot bind the anaphor.
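The contrast between a whole final attachment and a mere part of one can be made concrete in a small sketch. The Python below is only an illustration under stated assumptions: the names Constituent, MinimalDomain, binds_by_finality, and is_proper_part are hypothetical, the order of attachment is entered by hand, and the domain head "was pleased ..." is a schematic stand-in rather than a reconstruction of the full example sentences.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Constituent:
    """A completed constituent; `parts` records its own internal attachments."""
    label: str
    parts: List["Constituent"] = field(default_factory=list)


@dataclass
class MinimalDomain:
    """A head plus its attachments in order; the last attachment is the final one."""
    head: str
    attachments: List[Constituent]


def binds_by_finality(candidate: str, domain: MinimalDomain) -> bool:
    """True only if `candidate` is the WHOLE final attachment of the domain."""
    return bool(domain.attachments) and domain.attachments[-1].label == candidate


def is_proper_part(candidate: str, whole: Constituent) -> bool:
    """True if `candidate` occurs somewhere inside `whole` without being `whole` itself."""
    return any(
        part.label == candidate or is_proper_part(candidate, part)
        for part in whole.parts
    )


# "Bill's" is attached to "sister" first; only then is "Bill's sister"
# attached, as the final attachment, to the (schematic) predicate.
sister = Constituent("Bill's sister", parts=[Constituent("Bill's"), Constituent("sister")])
domain = MinimalDomain(head="was pleased ...", attachments=[sister])

print(binds_by_finality("Bill's sister", domain))  # whole final attachment
print(binds_by_finality("Bill's", domain))         # only a part of it
```

The first call succeeds and the second fails, mirroring the pattern described for (50b) and (50c): a possessor inside the final attachment is attached too early to count as final itself.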
Condition C in classical binding theory ("Referential expressions are free"), generally attributed to c-command, can also be shown to depend on order of attachment. Consider (51):
In (51a), "John" and "he" can co-refer; in (51b), they cannot. The reason is that addition of a final attachment to a minimal domain closes that domain, rendering it ineligible for many operations. For instance, while an item lacking independent reference that lies within a closed minimal domain may (subject to other factors) still be able to get reference from an antecedent in its maximal domain, the reverse is not possible: reference cannot flow outwards and upwards from an antecedent within the closed minimal domain, to a pronoun or anaphor in a maximal domain.
However, none of the above explains the Barss-Lasnik data. For this we must turn to the second basic principle of attachment, priority. In (49), "Bill" is attached first to "showed," and then "himself" is attached to "showed Bill"; "Bill" is therefore a prior attachment with respect to "himself" and within the same minimal domain, therefore binding it by virtue of that priority and domain membership. In (52), however,
"himself" is attached prior to "Bill," which therefore cannot bind it, even though, in this case, "Bill" c-commands "himself" (on the most straightforward structural assumptions).
Now consider the following examples:
Examples (53)-(55) show that anaphoric binding, quantifier scope, and negative scope (all of which supposedly involve c-command) follow an identical pattern in double-object cases; in all of these cases, a prior attachment binds a subsequent one.
In examples (53)-(55), goals (datives or indirect objects) precede themes (direct objects). It might be suggested that some thematic hierarchy, such as that referred to earlier by Jackendoff, was operative (goals can bind themes, but not vice versa); den Dikken, although he does not adopt a thematic solution, still finds himself obliged to treat goal-theme and theme-goal constructions differently,36 because theme-goal examples show an identical pattern to that of (53)-(55):
However, the present framework can provide a unified treatment for (53)-(58). In every case, regardless whether it is goal or theme, direct or indirect object, the first argument attached binds the second, according to the principle of priority binding.
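Priority binding lends itself to an equally brief sketch. The Python below is a minimal illustration under stated assumptions: the function name binds_by_priority is hypothetical, and the order in which completed arguments attach within a single minimal domain is supplied by hand. The two orders shown are those described for (49) and (52).

```python
from typing import List


def binds_by_priority(antecedent: str, dependent: str, order: List[str]) -> bool:
    """`order` lists the completed arguments of ONE minimal domain, in order of attachment.

    The antecedent binds the dependent only if both are arguments of that
    domain in their own right and the antecedent was attached first.
    """
    if antecedent not in order or dependent not in order:
        # A mere part of an argument is not listed, so it cannot bind by priority.
        return False
    return order.index(antecedent) < order.index(dependent)


# (49): "Bill" is attached to "showed" before "himself" is attached to "showed Bill".
order_49 = ["Bill", "himself"]
# (52): "himself" is attached before "Bill".
order_52 = ["himself", "Bill"]

print(binds_by_priority("Bill", "himself", order_49))
print(binds_by_priority("Bill", "himself", order_52))
```

Because the check consults nothing but order of attachment, it returns the same answer whether the first argument is a goal or a theme, which is the point of the unified treatment.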
Two possible alternatives can be quickly disposed of. Although in the cases discussed so far, an initial attachment binds a subsequent attachment, the operative principle is priority rather than initiality, as (59) shows:
c) Bill sent reminders about John and Mary's birthdays to each other.
d) ?*Bill sent reminders about each other's birthdays to John and Mary.
In (59), the initial attachment is of course "reminders," ruling out initiality as the binding factor. Although (59d) is marginally better than (59b), the overall pattern is the same: prior attachments bind subsequent ones, rather than vice versa.
Another alternative that has proved popular over the years37 has been that of linear precedence: the argument that precedes in linear structure binds the argument that follows. However, this proposal has been rightly rejected by most syntacticians, because syntactic relations are hierarchical rather than linear.
Moreover, the predictions of linear precedence, as well as those of c-command, are violated by cases of so-called "backwards" anaphora, where antecedents that neither precede nor command anaphors still bind them.38
Again, either priority or finality of attachment accounts for these cases. "Bill" in (60a) is attached prior to "himself." In (60b), "Bill" is part, but not the whole, of the final attachment, and therefore cannot bind; a similar reason disallows (60c), where "Bill" is part, but not the whole, of a prior attachment.
Cases like those of (60) are sometimes regarded as limited to experiencer predicates. However, as Pesetsky points out,39 any verb that has a causative (but non-agentive) subject gives similar results:
Moreover, passive sentences behave similarly:
Of course, with passives it is always possible to posit something like "reconstruction at LF" to account for examples like (62). A similar course is possible in some, but not all, cleft sentences, for example (63a), but not (63b).
While even on conservative analyses "Bill" can c-command "himself" in (63a) at some stage of the derivation, it is impossible for "Bill" to c-command "himself" without a nearer antecedent, "Mary," also c-commanding it, as in (63c). Thus while c-command will not work without some otherwise-unmotivated reconstruction, priority of attachment within the same domain accounts for all of examples (61)-(63).
Reconstruction, too, fails to give an adequate account of bound variable pronouns. Take the following, for instance:
In order to accommodate examples such as (65b), Chomsky40 proposed a "Leftness Principle" based on linear order:
However, as Huang41 points out, even this formulation will not accommodate examples like (67):
Neither (65) nor (67a) presents a problem for the present framework. Bound variable pronouns, like quantifiers, negatives, and anaphors, obey the principles of priority and finality. In (65b), "everyone" is a prior attachment and "him" a constituent of a final attachment, but the situation is not analogous either to that of (60a), where an anaphor in a final attachment is properly bound by a prior attachment, or (51a), where reference flows from a maximal into a minimal domain that is an argument of the maximal domain. (65b) differs from (60a) in that in (65a), the minimal domain of the pronoun is closed by a final attachment ("the woman") and differs from (51a) in that it is not a case of reference flowing from a maximal domain into a minimal domain. The sentence consists of two minimal domains, "The woman who left him" and "[The woman who left him] loved everyone," with constituents inside the square brackets inaccessible to further operations. This is because the verb "left" does not require its final attachment to be sentential (for that matter, no verb does!), so that no minimal-maximal relationship exists, and the combination of closedness and disjointness of the domains prevents "him" from being bound by "everyone."
Example (67b) is slightly different. Here the antecedent is in the final attachment and the pronoun is an initial attachment (therefore prior to all other attachments). But this configuration merely adds a third factor to the closedness and disjointness of domains that prevented binding in (65b). While final attachments can bind, "every man" forms only part of the final attachment to the minimal domain; this handicap alone would suffice to prevent binding (cf. example [50c] above). Thus although examples like (65) and (67) cannot be handled within other frameworks without extensions of core binding theory, the present framework involves no additional mechanisms.
Note that where priority and finality both obtain, neither overrides the other:
"Himself" in (68) can be either "John" or "Bill." Still more strikingly, consider (69):
In (69) any of the three arguments, "John," "Bill," and "Sam," can bind the anaphor. "Sam" binds by virtue of priority of attachment. "Bill" also binds by priority of attachment, since it is attached to "told" before "that pictures of himself . . ." is attached to "told Bill." "John" binds by virtue of finality, although it is not the final attachment to the minimal domain of "himself": because "himself" is itself part of the final attachment to its own minimal domain (thereby excluding the possibility of final attachment binding in that domain), it can be bound by the next final attachment, "John."
Let us now consider what might seem to be counterexamples to prior and final binding:
Why does (70b) fail prior binding? Because complex noun clauses whose heads derive from verbs (plus some that do not, like "picture" phrases, which mimic them) behave exactly like finite or nonfinite clauses: they are argument domains. (70c) is clearly related to "John analyzed himself." If this is the case, then "John's" is clearly the final attachment of a minimal domain, one moreover that is not required by the verb to be sentential. Attachment of a final argument to such a domain closes that domain, preventing coreference between other domain members and outside antecedents. In (70a), on the other hand, the minimal domain "analysis of herself" is not closed by a final attachment; that is, the final argument represented here by the Possessor "John's" has no equivalent, leaving the domain open to prior binding.
It should be noted in this context that complex derived NPs mirror verb-centered argument domains in every respect, showing results in quantifier scope and negative scope as well as in the binding of anaphors:42
Priority binding handles all these examples without appeal to any (otherwise unmotivated) mechanism. Note further that, because only the attachment of completed arguments, rather than those arguments' internal structure, counts in the present framework, the fact that many of these are PPs rather than NPs or DPs (which creates problems for any analysis relying on c-command) is quite irrelevant here.
A rather different type of structure is exhibited in (74):
At first sight, this seems like a clear violation of priority. A subsequent attachment appears to bind a prior one, while the predicted configuration fails to bind. However, the two arguments, "a fear of X and Y" and "X and Y's greatest problem," are not in fact arguments of "consider," but arguments of a small clause, which in turn is a reduction of finite and nonfinite clauses of similar meaning:
In (74a), therefore, "John and Mary's greatest problem" attaches first to a copula with zero phonetic representation,43 and accordingly has priority over the final attachment "a fear of each other." In (74b), in contrast, the anaphor attaches first, and even though "John and Mary" occurs in the final attachment, it is not the whole of that attachment, and therefore cannot bind through finality. Because no other potential binder is available, the sentence is ungrammatical.
A different type of problem is presented by sentences like (76):44
Here, a prior attachment seems to be bound by a subsequent one. But there are two things to be taken into consideration here. First, the structure of (76) is identical to that of (56b), which many writers,45 including myself, regard as ungrammatical. Second, in (76) (as for that matter in [56b]), the anaphor is not a prior attachment, but a part of such an attachment (it is attached to "friends" in , and to "students" in [56b], before being attached to the main structure of the sentence). Thus such cases are clearly marginal as compared with cases in which the anaphor, as well as a prior attachment, is itself an argument (e.g., , which is uncontroversially ungrammatical). Clearly something is going on here that has to do with differences between prior and final attachments, and between attachment within an argument as specifier as opposed to attachment as complement. This needs to be clarified by further research (see discussion of doubtful cases at the end of this section).
A case where varying informant judgements would seem not to be involved concerns the contrast between (77a) and (77b):
While (77a) is dubious, the contrast between (77a, b) seems clear enough. "Each other" binding suggests a cline of acceptability from sentences like (61a) through sentences like (77a) to sentences like (77b). It would seem here that a semantic differential, involving a progressively stronger degree of agency, is involved, as well perhaps as the specifier/complement distinction referred to in the previous paragraph. It may be that final attachments to final attachments have a status different from that of complements to final attachments; again, only further investigation will clarify the issue.
Many dubious cases involve sentences in which both antecedent and anaphor are non-arguments. Where both are arguments, results are always clear. When one is an argument and the other a nonargument, results are almost always clear, save for the kind of semantic issues shown in (77). When both are nonarguments, however, judgements can much more readily be influenced by semantic or even pragmatic factors, which can override the normal inability of a nonargument to bind an anaphor:
b) Attorneys for the twins thought each other's alibis were phoney.
The two sentences are identical in structure, but in (78a), the natural reading is that the relatives thought each other's stories were hilarious, rather than that relatives of the groom thought the bride's stories hilarious and relatives of the bride thought the groom's stories hilarious. However, in the very different context of (78b), it is perhaps more plausible to suppose that the attorney for twin A thought twin B's alibi was phoney while the attorney for twin B thought twin A's alibi was phoney, rather than that the two attorneys thought each other's alibis were phoney.
A similar example is (79):
Because stories do not have friends, "each other" can more readily be taken to corefer with "Bill and Mary." However, part of the problem lies in the very different statuses of "self" anaphors and "each other." The distribution of these as nonarguments is quite different; "himself" can only be a prior attachment (Comp) while "each other" can be either a final attachment (Spec) or a prior attachment. This may be connected with the fact that while there are pronominal equivalents for all "self" anaphors, there is no pronominal (that is, no nonreferential item free in its own domain) that corresponds with "each other." In consequence, "each other" sentences (provided there is a possible plural antecedent somewhere in the sentence) are likelier, other things being equal, to be judged grammatical than "self" sentences.
However, insofar as apparent counterexamples to the present proposal are either supportive of it (when correctly analyzed) or involve marginal cases influenced by semantic considerations, we can conclude that the principles of priority and finality give a coverage of scope and binding problems that is surprisingly broad, especially considering the restrictiveness of the mechanisms invoked.
Accordingly, we may state conditions on binding as follows:
In the foregoing sections, several of the main aspects of syntax have been discussed, including argument structure, government, the mapping of argument structure to phrase structure, movement, barriers to movement, empty categories, binding and co-reference, scope, and c-command. It has proved possible to account for major features in all these areas through the minimal primitives (A) through (D). The fact that this is possible, and that a grammar arising from research into the evolution of language combines such extensive coverage with such extremely restrictive principles, strongly suggests that language evolution did indeed follow the paths that we have proposed in the body of this book.
Copyright ©2000 by