
12

Language Structure

What makes the human species special? There are two basic hypotheses about

why people are intellectually different from other species. In the past few chapters,

I indulged my favorite theory, which is that we have unmatched abilities to solve

problems and reason about our world, owing in large part to the enormous development

of our prefrontal cortices. However, there is another theory at least as popular in cognitive

science, which is that humans are special because they alone possess a language.

This chapter and the next will analyze in more detail what language is, how people

process language, and what makes human language so special. This chapter will focus

primarily on the nature of language in general, whereas the next chapter will contain

more detailed analyses of how language is processed. We will consider some of the

basic linguistic ideas about the structure of language and evidence for the psychological

reality of these ideas, as well as research and speculation about the relation between

language and thought. We will also look at the research on language acquisition. Much

of the evidence both for and against claims about the uniqueness of human language

comes from research on the way in which children learn the structure of language.

In this chapter, we will answer the questions:

• What does the field of linguistics tell us about how language is processed?
• What distinguishes human language from the communication systems of other species?
• How does language influence the nature of human thought?
• How are children able to acquire a language?

•Language and the Brain

The human brain has features strongly associated with language. For almost all

of the 92% of people who are right-handed, language is strongly lateralized in

the left hemisphere. About half of the 8% of people who are left-handed still

have language left lateralized. So 96% of the population has language largely


in the left hemisphere. Findings from studies with split-brain patients (see

Chapter 1) have indicated that the right hemisphere has only the most rudimentary

language abilities. It was once thought that the left hemisphere was

larger, particularly in areas taking part in language processing, and that this

greater size accounted for the greater linguistic abilities associated with the left

hemisphere. However, neuroimaging techniques have suggested that the differences

in size are negligible, and researchers are now looking to see whether

there are differences in neural connectivity or organization (Gazzaniga, Ivry, &

Mangun, 2002) in the left hemisphere. It remains largely a mystery what differences

between the left and the right hemispheres could account for why language

is so strongly left lateralized.

Certain regions of the left hemisphere are specialized for language, and

these are illustrated in Figure 12.1. These areas were initially identified in studies

of patients who suffered aphasias (losses of language function) as a consequence

of stroke. The first such area was discovered by Paul Broca, the French surgeon

who, in 1861, examined the brain of such a patient after the patient’s death (the

brain is still preserved in a Paris museum). This patient was basically incapable

of spoken speech, although he understood much of what was spoken to him.

He had a large region of damage in a prefrontal area that came to be known

as Broca’s area. As can be seen in Figure 12.1, it is next to the motor region that

controls the mouth. Shortly thereafter, Carl Wernicke, a German physician,

identified patients with severe deficits in understanding speech who had damage

in a region in the superior temporal cortex posterior to the primary auditory

cortex. This area came to be known as Wernicke’s area. Parietal regions

close to Wernicke’s area (the supramarginal gyrus and angular gyrus) have also been found to be important to language.

FIGURE 12.1 A lateral view of the left hemisphere. Brain areas implicated in language include Broca’s area, Wernicke’s area, the supramarginal gyrus, the angular gyrus, the motor face area, and the primary auditory area. (From Dronkers, Redfern, & Knight, 2000.)

Two of the classic aphasias, now known as Broca’s aphasia and Wernicke’s aphasia, are associated with damage to these two regions. Chapter 1 gave

examples of the kinds of speech problems suffered by patients with these two

aphasias. The severity of the damage determines whether patients with Broca’s

aphasia will be unable to generate almost any speech (like Broca’s original

patient) or be capable of generating meaningful but ungrammatical speech.

Patients with Wernicke’s aphasia, in addition to having problems with comprehension,

sometimes produce grammatical but meaningless speech. Another

kind of aphasia is conduction aphasia; in this condition, patients have difficulty repeating speech and problems producing spontaneous speech.

Conduction aphasia is sometimes associated with damage to the parietal regions

shown in Figure 12.1.

Although the importance of these left-cortical areas to speech is well documented

and there are many well-studied cases of aphasia resulting from damage

in these regions, it has become increasingly apparent that there is no simple

mapping of damaged areas onto types of aphasia. Current research has focused

on more detailed analyses of the deficits and of the regions damaged in each

aphasic patient.

Although there is much to understand, it is a fact that human evolution and

development have selected certain left-cortical regions as the preferred locations

for language. It is not the case, however, that language has to be left lateralized.

There are those left-handers who have language in the right hemisphere, and

young children who suffer left-brain damage may develop language in the right

hemisphere, in regions that are homologous to those depicted in Figure 12.1 for

the left hemisphere.

Language is preferentially localized in the left hemisphere in prefrontal

regions (Broca’s area), temporal regions (Wernicke’s area), and parietal

regions (supramarginal and angular gyri).

•The Field of Linguistics

The academic field of linguistics attempts to characterize the nature of language.

It is distinct from psychology in that it studies the structure of natural

languages rather than the way in which people process natural languages.

Despite this difference, the work from linguistics has been extremely influential

in the psychology of language. As we will see, concepts from linguistics play

an important role in theories of language processing. As noted in Chapter 1, the

influence from linguistics was important to the decline of behaviorism and the

rise of modern cognitive psychology.

Productivity and Regularity

The linguist focuses on two aspects of language: its productivity and its

regularity. The term productivity refers to the fact that an infinite number of

utterances are possible in any language. Regularity refers to the fact that these

utterances are systematic in many ways. We need not seek far to convince


ourselves of the highly productive and creative character of language. Pick a

random sentence from this book or any other book of your choice and enter it

as an exact string (quoting it) in Google. If Google can find the sentence in all

of its billions of pages, it will probably either be from a copy of the book or a

quote from the book. In fact, these sorts of methods are used by programs to

catch plagiarism. Most sentences you will find in books were created only once in

human history. And yet it is important to realize that the components that make

up sentences are quite small in number: English uses only 26 letters, 40 phonemes

(see the discussion in the Speech Recognition section of Chapter 2), and some

tens of thousands of words. Nevertheless, with these components, we can and do

generate trillions of novel sentences.

A look at the structure of sentences makes clear why this productivity is

possible. Natural language has facilities for endlessly embedding structures within

structures and coordinating structures with structures. A mildly amusing party

game starts with a simple sentence and requires participants to keep adding to

the sentence:

• The girl hit the boy.
• The girl hit the boy and he cried.
• The big girl hit the boy and he cried.
• The big girl hit the boy and he cried loudly.
• The big girl hit the boy who was misbehaving and he cried loudly.
• The big girl with authoritarian instincts hit the boy who was misbehaving and he cried loudly.

And so on until someone can no longer extend the sentence.
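The open-ended embedding and coordination this game exploits are easy to reproduce mechanically. The following Python sketch is purely illustrative—the rewrite rules and the tiny vocabulary are my own, not a serious grammar of English—but it shows how a handful of recursive rules generates an unbounded number of distinct sentences:

```python
import random

# A toy set of rewrite rules (illustrative only). Because S and NP can
# rewrite into structures that contain S and NP again, even this tiny
# lexicon yields an unbounded number of distinct sentences.
GRAMMAR = {
    "S":     [["NP", "VP"], ["S", "and", "S"]],            # coordination
    "NP":    [["Art", "N"], ["Art", "Adj", "N"], ["NP", "RelCl"]],
    "VP":    [["V", "NP"], ["V", "NP", "Adv"]],
    "RelCl": [["who", "VP"]],                              # embedding
    "Art":   [["the"], ["a"]],
    "Adj":   [["big"], ["brave"]],
    "N":     [["girl"], ["boy"], ["dog"]],
    "V":     [["hit"], ["saved"]],
    "Adv":   [["loudly"]],
}

def generate(symbol="S", depth=0):
    """Expand a symbol by randomly choosing one of its rewrite rules."""
    if symbol not in GRAMMAR:
        return [symbol]                     # a terminal word
    rules = GRAMMAR[symbol]
    # Past a certain depth, take the first (non-recursive) rule so that
    # generation always halts.
    rule = rules[0] if depth > 3 else random.choice(rules)
    return [word for part in rule for word in generate(part, depth + 1)]

print(" ".join(generate()))   # e.g., "the big girl who hit a boy saved the dog"
```

Every run produces a string licensed by the rules, which is the regularity; the recursion in S, NP, and RelCl supplies the productivity.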

The fact that an infinite number of word strings can be generated would not

be particularly interesting in itself. If we have tens of thousands of words for

each position and if sentences can be of any length, it is not hard to see that

a very large (in fact, an infinite) number of word strings is possible. However,

if we merely combine words at random, we get “sentences” such as

• From runners physicians prescribing miss a states joy rests what thought most.

In fact, very few of the possible word combinations are acceptable sentences.

The speculation is often jokingly made that, given enough monkeys working at

typewriters for a long enough time, some monkey will type a best-selling book.

It should be clear that it would take a lot of monkeys a long time to type just

one acceptable *R@!#s.

So, balanced against the productivity of language is its highly regular character.

One goal of linguistics is to discover a set of rules that will account for

both the productivity and the regularity of natural language.

Such a set of rules is referred to as a grammar. A grammar should be able

to prescribe or generate all the acceptable utterances of a language and be able

to reject all the unacceptable sentences in the language. A grammar consists

of three types of rules—syntactic, semantic, and phonological. Syntax concerns


word order and inflection. Consider the following examples of sentences that

violate syntax:

• The girls hits the boys.
• Did hit the girl the boys?
• The girl hit a boys.
• The boys were hit the girl.

These sentences are fairly meaningful but contain some mistakes in word combinations

or word forms.

Semantics concerns the meaning of sentences. Consider the following sentences

that contain semantic violations, even though the words are correct in

form and syntactic position:

• Colorless green ideas sleep furiously.
• Sincerity frightened the cat.

These constructions are called anomalous sentences in that they are syntactically

well formed but nonsensical.

Phonology concerns the sound structure of sentences. Sentences can be correct

syntactically and semantically but be mispronounced. Such sentences are

said to contain phonological violations. Consider this example:

The Inspector opened his notebook. “Your name is Halcock, is’t no?” he began.

The butler corrected him. “H’alcock,” he said, reprovingly. “H, a, double-l?”

suggested the Inspector. “There is no h’aich in the name, young man. H’ay is

the first letter, and there is h’only one h’ell.” (Sayers, 1968, p. 73)

The butler, wanting to hide his cockney dialect, which drops the letter h, is systematically

mispronouncing every word that begins with a vowel.

The goal of linguistics is to discover a set of rules that captures the structural

regularities in a language.

Linguistic Intuitions

A major goal of linguistics is to explain the linguistic intuitions of speakers of a

language. Linguistic intuitions are judgments about the nature of linguistic utterances

or about the relations between linguistic utterances. Speakers of the language

are often able to make these judgments without knowing how they do so.

As such, linguistic intuition is another example of implicit knowledge, a concept

introduced in Chapter 7. Among these linguistic intuitions are judgments about

whether sentences are ill-formed and, if ill-formed, why. For instance, we can

judge that some sentences are ill-formed because they have bad syntactic structure

and that other sentences are ill-formed because they lack meaning. Linguists require

that a grammar capture this distinction and clearly express the reasons for it.

Another kind of intuition is about paraphrase. A speaker of English will judge that the following two sentences are similar in meaning and hence are paraphrases:

• The girl hit the boy.
• The boy was hit by the girl.


Yet another kind of intuition is about ambiguity. The following sentence has

two meanings:

• They are cooking apples.

This sentence can either mean that some people are cooking some apples or that the apples can be used for cooking. Moreover, speakers of the language can distinguish this type of ambiguity, which is called structural ambiguity, from lexical ambiguity, as in

• I am going to the bank.

where bank can refer either to a monetary institution or to a riverbank. Lexical

ambiguities arise when a word has two or more distinct meanings; structural

ambiguities arise when an entire phrase or sentence has two or more meanings.

Linguists try to account for the intuitions we have about paraphrases, ambiguity,

and the well-formedness of sentences.

Competence versus Performance

Our everyday use of language does not always correspond to the prescriptions

of linguistic theory. We generate sentences in conversation that, upon reflection,

we would judge to be ill-formed and unacceptable. We hesitate, repeat

ourselves, stutter, and make slips of the tongue. We misunderstand the meaning

of sentences. We hear sentences that are ambiguous but do not note their

ambiguity.

Another complication is that linguistic intuitions are not always clear-cut.

For instance, we find the linguist Lakoff (1971) telling us that, in the following

case, the first sentence is not acceptable but the second sentence is:

• Tell John where the concert’s this afternoon.
• Tell John that the concert’s this afternoon.

People are not always reliable in their judgments of such sentences and certainly

do not always agree with Lakoff.

Considerations about the unreliability of human linguistic behavior and

judgment led linguist Noam Chomsky (1965) to make a distinction between

linguistic competence, a person’s abstract knowledge of the language, and

linguistic performance, the actual application of that knowledge in speaking

or listening. In Chomsky’s view, the linguist’s task is to develop a theory of

competence; the psychologist’s task is to develop a theory of performance.

The exact relation between a theory of competence and a theory of performance

is unclear and can be the subject of heated debates. Chomsky has

argued that a theory of competence is central to performance—that our

linguistic competence underlies our ability to use language, if indirectly.

Others believe that the concept of linguistic competence is based on a rather

unnatural activity (making linguistic judgments) and has very little to do with

language use.

Linguistic performance does not always correspond to linguistic competence.


•Syntactic Formalisms

A major contribution of linguistics to the psychological study of language has

been to provide a set of concepts for describing the structure of language. The

most frequently used ideas from linguistics concern descriptions of the syntactic

structure of language.

Phrase Structure

A great deal of emphasis in linguistics has been given to understanding the syntax

of natural language. One central linguistic concept is phrase structure.

Phrase-structure analysis is not only significant in linguistics, but also important

to an understanding of language processing. Therefore, coverage of this

topic here is partly a preparation for material in the next chapter. Those of you

who have had a certain kind of training in high-school English will find the

analysis of phrase structure to be similar to parsing exercises.

The phrase structure of a sentence is the hierarchical division of the sentence

into units called phrases. Consider this sentence:

• The brave dog saved the drowning child.

If asked to divide this sentence into two major parts in the most natural way,

most people would provide the following division:

• (The brave dog) (saved the drowning child).

The parentheses distinguish the two separate parts. The two parts of the

sentence correspond to what are traditionally called subject and predicate or

noun phrase and verb phrase. If asked to divide the second part, the verb phrase,

further, most people would give

• (The brave dog) (saved [the drowning child]).

Often, analysis of a sentence is represented as an upside-down tree, as

in Figure 12.2. In this phrase-structure tree, sentence points to its subunits, the

noun phrase and the verb phrase, and each of these units points to its subunits. Eventually, the branches of the tree terminate in the individual words. Such tree-structure representations are common in linguistics. In fact, the term phrase structure is often used to refer to such tree structures.

FIGURE 12.2 An example of the phrase structure of a sentence. The tree structure illustrates the hierarchical division of the sentence The brave dog saved the drowning child into phrases.
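For readers who like to see structure as data, such a tree can be written down directly as nested tuples. The sketch below (illustrative Python; the encoding and names are my own) represents the Figure 12.2 tree and recovers the parenthesized division given above:

```python
# The Figure 12.2 tree as nested tuples: (label, children...); leaves are words.
tree = ("Sentence",
        ("Noun phrase",
         ("Article", "The"), ("Adj", "brave"), ("Noun", "dog")),
        ("Verb phrase",
         ("Verb", "saved"),
         ("Noun phrase",
          ("Article", "the"), ("Adj", "drowning"), ("Noun", "child"))))

def bracket(node):
    """Render a subtree in the parenthesized form used in the text."""
    if isinstance(node, str):
        return node                          # a word
    _label, *children = node
    rendered = [bracket(child) for child in children]
    if len(children) == 1:                   # a lexical node such as ("Noun", "dog")
        return rendered[0]
    return "(" + " ".join(rendered) + ")"

print(" ".join(bracket(child) for child in tree[1:]))
# -> (The brave dog) (saved (the drowning child))
```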

An analysis of phrase structure can point up structural ambiguities. Consider

again the sentence:

• They are cooking apples.

Whether cooking is part of the verb with are or part of the noun phrase with apples

determines the meaning of the sentence. Figure 12.3 illustrates the phrase

structure for these two interpretations. In Figure 12.3a, cooking is part of the

verb, whereas in Figure 12.3b, it is part of the noun phrase.

Phrase-structure analysis is concerned with the way that sentences are broken

up into linguistic units.

Pause Structure in Speech

Abundant evidence supports the argument that phrase structures play a key

role in the generation of sentences.1 When a person produces a sentence, he or

she tends to generate it a phrase at a time, pausing at the boundaries between

large phrase units. For instance, although no tape recorders were available in

Lincoln’s time, one might guess that he produced the first sentence of “The

Gettysburg Address” with brief pauses at the end of each of the major phrases

as follows:

Four score and seven years ago (pause)
our fathers brought forth on this continent (pause)
a new nation (pause)
conceived in liberty (pause)
and dedicated to the proposition (pause)
that all men are created equal (pause)

FIGURE 12.3 The phrase structures illustrating the two possible meanings of the ambiguous sentence They are cooking apples: (a) that those people (they) are cooking apples; (b) that those apples are for cooking.

1 In Chapter 13, we will examine the role of phrase structures in language comprehension.

Although Lincoln’s speeches are not available for auditory analysis, Boomer

(1965) analyzed examples of spontaneous speech and found that pauses did

occur more frequently at junctures between major phrases and that these

pauses were longer than pauses at other locations. The average pause time

between major phrases was 1.03 s, whereas the average pause within phrases

was 0.75 s. This finding suggests that speakers tend to produce sentences a

phrase at a time and often need to pause after one phrase to plan the next.

Other researchers (Cooper & Paccia-Cooper, 1980; Grosjean, Grosjean, &

Lane, 1979) looked at participants producing prepared sentences rather than

spontaneous speech. The pauses of such participants tend to be much shorter,

about 0.2 s. Still, the same pattern holds, with longer pauses at the major

phrase boundaries.

As Figures 12.2 and 12.3 illustrate, there are multiple levels of phrases

within phrases within phrases. What level do speakers choose for breaking up

their sentences into pause units? Gee and Grosjean (1983) argued that speakers

tend to choose the smallest level above the word that bundles together coherent

semantic information. In English, this level tends to be noun phrases (e.g., the

young woman), verbs plus pronouns (e.g., will have been reading it), and

prepositional phrases (e.g., in the house).

People tend to pause briefly after each meaningful unit of speech.

Speech Errors

Other research has found evidence for phrase structure by looking at errors in

speech. Maclay and Osgood (1959) analyzed spontaneous recordings of speech

and found a number of speech errors that suggested that phrases do have a psychological

reality. They found that, when speakers repeated themselves or corrected

themselves, they tended to repeat or correct a whole phrase. For instance,

the following kind of repeat is found:

• Turn on the heater/the heater switch.

and the following pair constitutes a common type of correction:

• Turn on the stove/the heater switch.

In the preceding example, the noun phrase is repeated. In contrast, speakers do

not produce repetitions in which part, but not all, of the verb phrase is repeated,

such as

• Turn on the stove/on the heater switch.

Other kinds of speech errors also provide evidence for the psychological reality

of constituents as major units of speech generation. For instance, some research

has analyzed slips of the tongue in speech (Fromkin, 1971, 1973; Garrett, 1975).


One kind of speech error is called a spoonerism, after the English clergyman

William A. Spooner to whom are attributed some colossal and clever errors of

speech. Among the errors of speech attributed to Spooner are:

• You have hissed all my mystery lectures.
• I saw you fight a liar in the back quad; in fact, you have tasted the whole worm.
• I assure you the insanitary spectre has seen all the bathrooms.
• Easier for a camel to go through the knee of an idol.
• The Lord is a shoving leopard to his flock.
• Take the flea of my cat and heave it at the louse of my mother-in-law.

As illustrated here, spoonerisms consist of exchanges of sound between words.

There is some reason to suspect that the preceding errors were deliberate

attempts at humor by Spooner. However, people do generate genuine spoonerisms,

although they are seldom as funny.

By patient collecting, researchers have gathered a large set of errors made by

friends and colleagues. Some of these errors are simple sound anticipations and

some are sound exchanges as in spoonerisms:

• Take my bike → bake my bike [an anticipation]
• night life → nife lite [an exchange]
• beast of burden → burst of beaden [an exchange]

One that gives me particular difficulty is

• coin toss → toin coss

The first error in the preceding list is an example of an anticipation, where

an early phoneme is changed to a later phoneme. The others are examples of

exchanges in which two phonemes switch. The interesting feature about these

kinds of errors is that they tend to occur within a single phrase rather than

across phrases. So, we are unlikely to find an anticipation, like the following,

which occurs between subject and object noun phrases:

• The dancer took my bike. → The bancer took my dike.

Also unlikely are sound exchanges where an exchange occurs between the initial

prepositional phrase and the final noun phrase, as in the following:

• At night John lost his life. → At nife John lost his lite.

Garrett (1990) distinguished between errors in simple sounds and those in

whole words. Sound errors occur at what he called the positional level, which

basically corresponds to a single phrase, whereas word errors occur at what he

called the functional level, which corresponds to a larger unit of speech such as

a full clause. Thus, the following word error has been observed:

• That kid’s mouse makes a great toy. → That kid’s toy makes a great mouse.

whereas the following sound error would be unlikely:

• That kid’s mouse makes a great toy. → That kid’s touse makes a great moy.


In Garrett’s (1980) corpus, 83% of all word exchanges extended beyond phrase

boundaries, but only 13% of sound errors did. Word and sound errors are

generally thought to occur at different levels in the speech-production process.

Words are inserted into the speech plan at a higher level of planning, and so a

larger distance is possible for the substitution.

An experimental procedure has been developed for artificially producing

spoonerisms in the laboratory (Baars, Motley, & MacKay, 1975; Motley, Camden,

& Baars, 1982). This involves presenting a series of word pairs like

Big Dog

Bad Deal

Beer Drum

**Darn Bore**

House Coat

Whale Watch

and asking the participants to speak certain words such as the asterisked Darn

Bore in the above series. When they have been primed with a series of word

pairs with the opposite order of first consonants (the preceding three all are

B—— D——), they show a tendency to reverse the order of the first consonants,

in this case producing Barn Door. Interestingly, participants are much

more likely to produce such an error if it produces real words, as it does in the

above case, than if it does not (as in the case of Dock Boat, which if reversed

would become Bock Doat). Participants are also sensitive to a host of other

facts such as whether the pair is grammatically appropriate and whether it is

culturally appropriate (e.g., they are more likely to convert cast part into past

cart than they are to convert fast part into past fart). This research has been

taken as evidence that we combine multiple factors into the selection of speech

items.
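As a rough illustration of the manipulation behind this paradigm, the sketch below exchanges the initial consonant clusters of a word pair and checks the results against a word list. The simplifications are mine: the actual effect concerns spoken phonemes rather than spelling, and the tiny lexicon merely stands in for a speaker’s vocabulary.

```python
# Swap the word onsets, as in the Darn Bore -> Barn Door example, and test
# whether the slip would be made up of real words (here, words in LEXICON).
LEXICON = {"barn", "door", "darn", "bore", "cast", "part", "past", "cart",
           "dock", "boat"}

def onset_split(word):
    """Split a word into its initial consonant cluster and the remainder."""
    for i, ch in enumerate(word):
        if ch in "aeiou":
            return word[:i], word[i:]
    return word, ""

def spoonerize(first, second):
    """Exchange the onsets of two words, e.g., 'cast part' -> 'past cart'."""
    o1, rest1 = onset_split(first)
    o2, rest2 = onset_split(second)
    return o2 + rest1, o1 + rest2

for pair in [("cast", "part"), ("dock", "boat")]:
    slip = spoonerize(*pair)
    print(slip, all(word in LEXICON for word in slip))
# ('past', 'cart') True   -- a lexical outcome, the kind participants produce
# ('bock', 'doat') False  -- a nonword outcome, which participants resist
```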

Speech errors involving substitutions of sounds and words suggest that words

are selected at the clause level, whereas sounds are inserted at a lower phrase

level.

Transformations

A phrase structure describes a sentence hierarchically as pieces within larger

pieces. There are certain types of linguistic constructions that some linguists

think violate this strictly hierarchical structure. Consider the following pair of

sentences:

1. The dog is chasing Bill down the street.

2. Whom is the dog chasing down the street?

In sentence 1, Bill, the object of the chasing, is part of the verb phrase. On the

other hand, in sentence 2, whom, the object of the verb phrase, is at the beginning

of the sentence. The object is no longer part of the verb-phrase structure


to which it would seem to belong. Some linguists have proposed that, formally,

such questions are generated by starting with a phrase structure that has the

object whom in the verb phrase, such as

3. The dog is chasing whom down the street?

This sentence is somewhat strange but, with the right questioning intonation

of the whom, it can be made to sound reasonable. In some languages, such

as Japanese, the interrogative pronoun is normally in the verb phrase, as in

sentence 3. However, in English, the proposal is that there is a movement transformation

that moves the whom into its more normal position. Note that this

proposal is a linguistic one concerning the formal structure of language and

may not describe the actual process of producing the question.
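The flavor of such a movement transformation can be conveyed with a toy function. Real transformational analyses are defined over phrase-structure trees rather than word strings, so this flat-list Python sketch (all names my own) only illustrates fronting whom together with inversion of the auxiliary:

```python
# A toy wh-movement "transformation" over a flat list of words.
def wh_question(words):
    """'the dog is chasing whom down the street'
       -> 'whom is the dog chasing down the street'"""
    words = list(words)                     # work on a copy
    words.remove("whom")                    # extract the wh-word
    aux = words.pop(words.index("is"))      # subject-auxiliary inversion
    return ["whom", aux] + words

base = "the dog is chasing whom down the street".split()
print(" ".join(wh_question(base)))
# -> whom is the dog chasing down the street
```

Nothing in this string-level sketch can explain why movement fails in cases like sentence 9 discussed below; capturing such restrictions is one reason linguists state transformations over structures rather than strings.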

Some linguists believe that a satisfactory analysis of language requires such

transformations, which move elements from one part of the sentence to another

part. Transformations can also operate on more complicated sentences.

For instance, we can apply it to sentences of the form

4. John believes that the dog is chasing Bill down the street.

The corresponding question forms are

5. John believes that the dog is chasing whom down the street?

6. Whom does John believe that the dog is chasing down the street?

Sentence 5 is strange even with a questioning intonation for whom, but still

some linguists believe that sentence 6 is transformationally derived from it,

even though we would never produce sentence 5.

An intriguing concern to linguists is that there seem to be real limitations

on just what things can be moved by transformations. For instance, consider

the following set of sentences:

7. John believes the fact that the dog is chasing Bill down the street.

8. John believes the fact that the dog is chasing whom down the street?

9. Whom does John believe the fact that the dog is chasing down the street?

As sentence 7 illustrates, the basic sentence form is acceptable, but one cannot

move whom from question form 8 to produce question form 9. Sentence 9

just sounds bizarre. We will return later to the restrictions on movement

transformations.

In contrast with the abundant evidence for phrase structure in language

processing, the evidence that people actually compute anything analogous to

transformations in understanding or producing sentences is very poor. How

people process such transformationally derived sentences remains very much

an open question. It is also the case that there is a lot of controversy within linguistics

about how to conceive of transformations. The role of transformations

has been deemphasized in many proposals.

Transformations move elements from their normal positions in the phrase

structure of a sentence.


•What Is So Special about Human Language?

We have reviewed some of the features of human language, with the implicit

assumption that no other species has anything like such a language. What gives

us this conceit? How do we know that other species do not have their own

languages? Perhaps we just do not understand the languages of other species.

Certainly, all social species communicate with one another and, ultimately,

whether we call their communication systems languages is a definitional matter.

However, human language is fundamentally different from these other systems,

and it is worth identifying some of the features (Hockett, 1960) that are considered

critical to human language.

Semanticity and arbitrariness of units. Consider, for instance, the communication

system of dogs. They have a nonverbal system that is very effective in

communication. The reason that dogs are such successful pets is thought to be

that their nonverbal communication system is so much like that of humans.

Besides being nonverbal, canine communication has more fundamental limitations.

Unlike human language, in which the relation between signs and meaning

is arbitrary (there is no reason why “Good dog” and “Bad dog” should mean

what they do), dogs’ signs are directly related to means—a snarl for aggression

(which often reveals the dog’s sharp incisors), exposing the neck (a vulnerable

part of the dog’s body) for submission, and so on. However, although canines

have a nonarbitrary communication system, it is not the case that all species do.

For instance, the vocalizations of some species of monkeys have this property of

arbitrary meaning (Marler, 1967). One species, the Vervet Monkey, has different

warning calls for different types of predators—a “chutter” for snakes, a “chirp”

for leopards, and a “kraup” for eagles.

Displacement in time and space. A critical feature of the monkey warning

system is that the monkeys use it only in the presence of a danger. They do not

use it to “discuss” the day’s events at a later time. An enormously important

feature of human language (exemplified by this book) is that it can be used to

communicate over time and distance. Interestingly, the “language” of honeybees

satisfies the properties of both arbitrariness and displacement (von Frisch,

1967). When a honeybee returns to a nest after finding a food source, it will

engage in a dance to communicate the location of the food source. The “dance”

consists of a straight run followed by a turn to the right to circle back to the

starting point, another straight run, followed by a turn and circle to the left, and

so on, in an alternating pattern. The length of the run indicates the distance of

the food and the direction of the run relative to vertical indicates the direction

relative to the sun.

Discreteness and productivity. Language contains discrete units—a criterion that disqualifies the bee dance system, in which distance and direction vary continuously, although the monkey warning system meets it. Requiring a language to have discrete units is

not just an arbitrary regulation to disqualify the dance of the bees. This discreteness

enables the elements of the language to be combined into an almost

infinite number of phrase structures and for these phrase structures to be

transformed, as already described. As will become more and more apparent


in these chapters, this ability to combine symbols makes human language different

from the communication systems of all other species.

It is a striking fact that all people in the world, even those in isolated communities,

speak a language. No other species, not even genetically close apes,

spontaneously uses a communication system anything like human language.

However, many people have wondered whether apes such as chimpanzees could

be taught a language. Early in the 20th century, there were attempts to teach

chimpanzees to speak that failed miserably (C. Hayes, 1951; Kellogg & Kellogg,

1933). It is now clear that the human vocal apparatus has undergone special

evolutionary adaptations to enable speech, and it was a hopeless goal to try to

teach chimps to speak. However, apes have considerable manual dexterity and,

more recently, there have been some well-publicized attempts to teach chimpanzees

and other apes manual languages.

Some of the studies have used American Sign Language (e.g., Gardner &

Gardner, 1969), which is a full-fledged language and makes the point that

language need not be spoken. These attempts were only modest successes

(e.g., Terrace, Pettito, Sanders, & Bever, 1979). Although the chimpanzees could

acquire vocabularies of more than a hundred signs, they never used them with

the productivity typical of humans in using their own language. Some of the

more impressive attempts have actually used artificial languages consisting of

“words” called lexigrams, made from plastic shapes, that can be attached to a

magnetic board (e.g., Premack & Premack, 1983).

Perhaps the most impressive example comes from a bonobo great ape called

Kanzi (Savage-Rumbaugh et al., 1993; see Figure 12.4). Bonobos are considered

even closer genetically to humans than chimpanzees are, but are rare. Kanzi’s

mother was a subject of one of these efforts, and Kanzi simply came along with

his mother and observed her training sessions. However, he spontaneously

started to use the lexigrams, and the experimenters began working with their newfound subject. His spontaneous constructions were quite impressive, and it was discovered that he had also acquired a considerable ability to understand spoken language. When he was 5.5 years of age, his comprehension of spoken English was determined to be equivalent to that of a 2-year-old human.

FIGURE 12.4 Kanzi, a bonobo, listening to English. A number of videos of Kanzi can be found on YouTube by searching with his name. (The Language Research Center.)

Implications

Ape language and the ethics of experimentation

The issue of whether apes can be taught human languages interlinks in complex ways with issues about the ethical treatment of animals in research. The philosopher Descartes believed that language was what separated humans from animals. According to this view, if apes could be shown capable of acquiring a language, they would have human status and should be given the same rights as humans in experimentation. One might even ask that they give informed consent before participating in an experiment. Certainly, any procedure that involved injury would not be acceptable. There has been a fair amount of research involving invasive brain procedures with primates, but most of this has involved monkeys, not the great apes. Interestingly, it has been reported that studies with linguistic apes found that they categorized themselves with humans and separate from other animals (Linden, 1974). It has been argued that it is in the best interests of apes to teach them a language because this would confer on them the rights of humans. However, others have argued that teaching an ape a human language deadens their basic nature and that the real issue is that humans have lost the ability to understand apes.

The very similarity of primates to humans is what makes them such attractive subjects for research. There are severe restrictions on research on apes in many countries, and in 2008 the Great Ape Protection Act, which would have prohibited any invasive research involving great apes, was introduced in the U.S. Congress. Much of the concern is with the use of apes to study human disease, where the potential benefits are great but the moral issues of infecting an animal are also severe. From this perspective, most cognitive research with apes, such as that on language acquisition, is quite benign. From a cognitive perspective, apes are the only creatures that have thought processes close to those of humans, and they offer potential insights we cannot get from other species. Nonetheless, many have argued that all research that removes them from their natural setting, including language acquisition research, should be banned.

As in other things, it seems unwise to conclude that human linguistic abilities

are totally discontinuous from the abilities of genetically close primates.

However, the human propensity for language is remarkable in the animal

world. Steven Pinker (1994b) coined the phrase “language instinct” to describe

the propensity for every human to acquire language. In his view, it is something

wired into the human brain through evolution. Just as songbirds are born with

the propensity to learn the song of their species, so we are born with the

propensity to learn the language of our society. Just as humans might try to

imitate the song of birds and partly succeed, other species, like the bonobo,

may partly succeed at mastering the language of humans. However, bird song

is special to songbirds and language is special to humans.


Only humans show the propensity or the ability to acquire a complex

communication system that combines symbols in a multitude of ways like

natural language.

•The Relation Between Language and Thought

All reasonable people would concede that there is some special connection between

language and humans. However, there is a lot of controversy about why

there is such a connection. Many researchers, like Steven Pinker and Noam

Chomsky, believe that humans have some special genetic endowment that enables

them to learn language. However, others argue that what is special is general

human intellectual abilities and that these abilities enable us to shape our

communication system to be something as complex as natural language. I confess

to leaning toward this alternate viewpoint. It raises the question of what

might be the relation between language and thought. There are three possibilities

that have been considered:

1. Thought depends in various ways on language.

2. Language depends in various ways on thought.

3. They are two independent systems.

We will go through each of these ideas in turn, starting with the proposal that

thought depends on language. There have been a number of different versions of this proposal, including the radical behaviorist proposal that thought is just speech and a more modest proposal called linguistic determinism.

The Behaviorist Proposal

As discussed in Chapter 1, John B. Watson, the father of behaviorism, held that

there was no such thing as internal mental activity at all. All that humans do,

Watson argued, is to emit responses that have been conditioned to various stimuli.

This radical proposal, which, as noted in Chapter 1, held sway in America for

some time, seemed to fly in the face of the abundant evidence that humans

can engage in thinking behavior (e.g., do mental arithmetic) that entails no

response emission. To deal with this obvious counter, Watson proposed that

thinking was just subvocal speech—that, when people were engaged in such

“thinking” activities, they were really talking to themselves. Hence, Watson’s

proposal was that a very important component of thought is simply subvocal

speech. (The philosopher Herbert Feigl once said that Watson “made up his

windpipe that he had no mind.”)

Watson’s proposal was a stimulus for a research program that engaged in

taking recordings to see whether evidence could be found for subvocal activity

of the speech apparatus during thinking. Indeed, often when a participant is

engaged in thought, it is possible to get recordings of subvocal speech activity.

However, the more important observation is that, in some situations, people

engage in various silent thinking tasks with no detectable vocal activity. This


finding did not upset Watson. He claimed that we think with our whole bodies—

for instance, with our arms. He cited the fascinating evidence that deaf mutes

actually make signs while asleep. (Speaking people who have done a lot of communication

in sign language also sign while sleeping.)

The decisive experiment addressing Watson’s hypothesis was performed by

Smith, Brown, Toman, and Goodman (1947). They used a curare derivative

that paralyzes the entire voluntary musculature. Smith was the participant for

the experiment and had to be kept alive by means of an artificial respirator.

Because his entire musculature was completely paralyzed, it was impossible for

him to engage in subvocal speech or any other body movement. Nonetheless,

under curare, Smith was able to observe what was going on around him, comprehend

speech, remember these events, and think about them. Thus, it seems

clear that thinking can proceed in the absence of any muscle activity. For our

current purposes, the relevant additional observation is that thought is not just

implicit speech but is truly an internal, nonmotor activity.

Additional evidence that thought is not to be equated with language comes

from the research on memory for meaning that was reviewed in Chapter 5.

There, we considered the fact that people tend to retain not the exact words of a

linguistic communication, but rather a more abstract representation of the

meaning of the communication. Thought might be identified, at least in part,

with this abstract, nonverbal propositional code. As mentioned there in regard

to the perceptual symbol system hypothesis, the abstractness of human thought

is under reconsideration. However, even the perceptual symbol system proposal

holds that thought is more than just subvocal speech and that, rather, it consists

of rich internal perceptual representations.

Additional evidence that thought is more than subvocal speech comes from

the occasional person who has no apparent language at all but who certainly

gives evidence of being able to think. Additionally, it seems hard to claim that

nonverbal animals such as apes are unable to think. Recall, for instance, the

problem-solving exploits of Sultan in Chapter 8. It is always hard to determine

the exact character of the “thought processes” of nonverbal participants and

the way in which these processes differ from the thought processes of verbal

participants, because there is no language with which nonverbal participants

can be interrogated. Thus, the apparent dependence of thought on language

may be an illusion that derives from the fact that it is hard to obtain evidence

about thought without using language.

The behaviorists believed that thought consists only of covert speech and

other implicit motor actions, but evidence has shown that thought can

proceed in the absence of any motor activity.

The Whorfian Hypothesis of Linguistic Determinism

Linguistic determinism is the claim that language determines or strongly

influences the way that a person thinks or perceives the world. This proposal is

much weaker than Watson’s position because it does not claim that language


and thought are identical. The hypothesis has been advanced by a good many

linguists but has been most strongly associated with Whorf (1956). Whorf was

quite an unusual character himself. He was trained as a chemical engineer at

MIT, spent his life working for the Hartford Fire Insurance Company, and studied

North American Indian languages as a hobby. He was very impressed by the

fact that different languages emphasize in their structure rather different aspects

of the world. He believed that these emphases in a language must have a

great influence on the way that speakers of that language think about the world.

For instance, he claimed that Eskimos have many different words for snow, each

of which refers to snow in a different state (wind-driven, packed, slushy, and so

on), whereas English speakers have only a single word for snow.2 Many other

examples exist at the vocabulary level: The Hanunoo people in the Philippines

supposedly have 92 different names for varieties of rice. The Arabic language

has many different ways of naming camels. Whorf felt that such a rich variety

of terms would cause the speaker of the language to perceive the world differently

from a person who had only a single word for a particular category.

Deciding how to evaluate the Whorfian hypothesis is very tricky. Nobody

would be surprised to learn that Eskimos know more about snow than average

English speakers. After all, snow is a more important part of their life experience.

The question is whether their language has any effect on the Eskimos’ perception

of snow beyond the effect of experience. If speakers of English went through the

Eskimo life experience, would their perception of snow be any different from

that of the Eskimo-language speakers? (Indeed, ski bums have a life experience

that includes a great deal of exposure to snow; they have a great deal of knowledge

about snow and, interestingly, have developed new terms for snow.)

One fairly well-researched test of the issue uses color words. English has 11

basic color words—black, white, red, green, yellow, blue, brown, purple, pink,

orange, and gray—a large number. These words are called basic color words

because they are short and are used frequently, in contrast with such terms as

saffron, turquoise, and magenta. At the other extreme is the language of the

Dani, a Stone Age agricultural people of Indonesian New Guinea. This language

has just two basic color terms: mili for dark, cold hues and mola for bright,

warm hues. If the categories in language determine perception, the Dani should

perceive color in a less refined manner than English speakers do. The relevant

question is whether this speculation is true.

Speakers of English, at least, judge a certain color within the range referred

to by each basic color term to be the best—for instance, the best red, the best

blue, and so on (see Berlin & Kay, 1969). Each of the 11 basic color terms in

English appears to have one generally agreed upon best color, called a focal

color. English speakers find it easier to process and remember focal colors than

nonfocal colors (e.g., Brown & Lenneberg, 1954). The interesting question is

whether the special cognitive capacity for identifying focal colors developed


2 There have been challenges to Whorf’s claims about the richness of Eskimo vocabulary for snow (L. Martin, 1986; Pullum, 1989). In general, there is a feeling that Whorf exaggerated the variety of words in various languages.


because English speakers have special words for these colors. If so, it would be a

case of language influencing thought.

To test whether the special processing of focal colors was an instance of

language influencing thought, Rosch (who published some of this work under

her former name, Heider) performed an important series of experiments on the

Dani. The point was to see whether the Dani processed focal colors differently

from English speakers. One experiment (Rosch, 1973) compared Dani and

English speakers’ ability to learn nonsense names for focal colors with that for

nonfocal colors. English speakers find it easier to learn arbitrary names for focal

colors. Dani participants also found it easier to learn arbitrary names for

focal colors than for nonfocal colors, even though they have no names for these

colors. In another experiment (Heider, 1972), participants were shown a color

chip for 5 s; 30 s after the presentation ended, they were required to select the

color from among 160 color chips. Both English and Dani speakers perform

better at this task when they are trying to locate a focal color chip rather than a

nonfocal color chip. The physiology of color vision suggests that many of these

focal colors are specially processed by the visual system (de Valois & Jacobs,

1968). The fact that many languages develop basic color terms for just these

colors can be seen as an instance of thought determining language.3

However, more recent research by Roberson, Davies, and Davidoff (2000)

does suggest an influence of language on ability to remember colors. They

compared British participants with another Papua New Guinea group who

speak Berinmo, a language that has five basic color terms. Color Plate 12.1

compares how the Berinmo cut up the color space with how English speakers

cut up the color space. Replicating the earlier work, they found that there was

superior memory for focal colors regardless of language. However, there were

substantial effects of the color boundaries as well. The researchers examined

distinctions that were important in one language versus another. For instance,

the Berinmo make a distinction between the colors wor and nol in the middle

of the English green category, whereas English speakers make their yellow-green

distinction in the middle of the Berinmo wor category.

Participants from both languages were asked

to learn to sort stimuli at these two boundaries into

two categories. Figure 12.5 shows the amount of

effort that the two populations put into learning the

two distinctions. English speakers found it easiest to

sort stimuli at the yellow-green boundary, whereas

Berinmo found it easiest to sort stimuli at the nol-wor

distinction.

Note that both populations are capable of making

distinctions that are important to the other

population. Thus, it is not that their language has

made them blind to color distinctions. However,

they definitely find it harder to see the distinctions

not signaled in their language and to learn to make them consistently. Thus, although language does not completely determine how we see the color space, it does have an influence.

FIGURE 12.5 Mean trials to criterion for the two populations (Berinmo and English speakers) learning distinctions at the nol-wor boundary and at the yellow-green boundary. (From Roberson et al., 2000.)

3 For further research on this topic, read Lucy and Shweder (1979, 1988) and Garro (1986).

Language can influence thought, but it does not totally determine the types

of concepts that we can think about.

Does Language Depend on Thought?

The alternative possibility is that the structure of language is determined by the

structure of thought. Aristotle argued 2500 years ago that the categories of

thought determined the categories of language. There are some reasons for

believing that he was correct, but most of these reasons were not available to

Aristotle. So, although the hypothesis has been around for 2500 years, we have

better evidence today.

There are numerous reasons to suppose that humans’ ability to think (i.e.,

to engage in nonlinguistic cognitive activity such as remembering and problem

solving) appeared earlier evolutionarily and occurs sooner developmentally than

the ability to use language. Many species of animals without language appear to

be capable of complex cognition. Children, before they are effective at using

their language, give clear evidence of relatively complex cognition. If we accept

the idea that thought evolved before language, it seems natural to suppose that

language arose as a tool whose function was to communicate thought. It is generally

true that tools are shaped to fit the objects on which they must operate.

Analogously, it seems reasonable to suppose that language has been shaped to

fit the thoughts that it must communicate.

We saw in Chapter 5 that propositional structures constitute a very important

type of knowledge structure in representing information both derived

from language and derived from pictures. This propositional structure is

manifested in the phrase structure of language. The basic phrase units of a

language tend to convey propositions. For instance, the tall boy conveys the

proposition that the boy is tall. This phenomenon itself—the existence of

a linguistic structure, the phrase, designed to accommodate a thought structure,

the proposition—seems to be a clear example of the dependence of

language on thought.

Another example of the way in which thought shapes language comes from

Rosch’s research on focal colors. As stated earlier, the human visual system is

maximally sensitive to certain colors. As a consequence, languages have special,

short, high-frequency words with which to designate these colors. Thus, the

visual system has determined how the English language divides up the color

space.

We find additional evidence for the influence of thought on language when

we consider word order. Every language has a preferred word order for expressing

subject (S), verb (V), and object (O). Consider this sentence, which exhibits

the preferred word order in English:

• Lynne petted the Labrador.


English is referred to as an SVO language. In a study of a diverse sample of

the world’s languages, Greenberg (1963) found that only four of the six possible

orders of S, V, and O are used in natural languages, and one of these four orders

is rare. The six possible word orders and the frequency of each order in the

world’s languages are as follows (the percentages are from Ultan, 1969):

SOV 44% VOS 2%

SVO 35% OVS 0%

VSO 19% OSV 0%

The important feature is that the subject almost always precedes the object.

This order makes good sense when we think about cognition. An action starts

with the agent and then affects the object. It is natural therefore that the subject

of a sentence, when it reflects its agency, is first.

In many ways, the structure of language corresponds to the structure of how

our minds process the world.

Modularity of Language

We have considered the possibility that thought might depend on language and

the possibility that language might depend on thought. A third logical possibility

is that language and thought might be independent. A special version of this

independence principle is called the modularity position (Chomsky, 1980;

Fodor, 1983). This position holds that important language processes function

independently from the rest of cognition. Fodor argued that a separate linguistic

module first analyzes incoming speech and then passes this analysis on

to general cognition. Fodor thought that this linguistic module was similar in

this respect to early visual processing, which largely proceeds in response to

the visual stimulus independent of higher-level intentions.4 Similarly, in language

generation, the linguistic module takes the intentions to be spoken and

produces the speech. This position does not deny that the linguistic module

may have been shaped to communicate thought. However, it argues that it

operates according to different principles from the rest of cognition and is

“encapsulated” such that it cannot be influenced by general cognition. In essence,

the claim is that language’s communication with other mental processes is

limited to passing its products to general cognition and receiving the products

of general cognition.

One piece of evidence for the independence of language from other cognitive

processes comes from research on people who have substantial deficits in

language but not in general cognition or vice versa. Williams syndrome, a rare

genetic disorder, is an example of mental retardation that seems not to affect linguistic fluency (Bellugi, Wang, & Jernigan, 1994). On the other side, there are

people who have severe language deficits without accompanying intellectual

deficits, including both some aphasics and some with developmental problems.


4. However, as reviewed in Chapter 3, there are some effects of visual attention in primary visual cortex—for example, see the discussion of Figure 3.10.


Specific language impairment (SLI) is a term used to describe a pattern of deficit

in the development of language that cannot be explained by hearing loss, mental

retardation, or other nonlinguistic factors. It is a diagnosis of exclusion and

probably has a number of underlying causes; in some cases, these causes appear

to be genetic (Stromswold, 2000). Recently, a mutation in a specific gene, called

FOXP2, has been linked in the popular press to specific language deficits (e.g., Wade, 2003), although there appear to be other cognitive deficits associated

with this mutation as well (Vargha-Khadem, Watkins, Alcock, Fletcher, &

Passingham, 1995). The FOXP2 gene is very similar in all mammals, although

the human FOXP2 is distinguished from that of other primates by two amino

acids (out of 715). Mutations in the FOXP2 gene are associated with vocal

deficits and other deficits in many species. For instance, such mutations result in incomplete

acquisition of song imitation in birds (Haesler et al., 2007). It has been claimed

that the human form of the FOXP2 gene became established in the human population

about 50 thousand years ago when, according to some proposals, human

language emerged (Enard et al., 2002). However, more recent evidence suggests

these changes are shared with Neanderthals and occurred 300 to 400 thousand

years ago (Krause et al., 2007). Although the FOXP2 gene does play an important

role in language, it does not appear to provide strong evidence for a genetic

basis for a unique language ability.

The modularity hypothesis has turned out to be a major dividing issue in the

field, with different researchers lining up in support or in opposition. Two domains

of research have played a major role in evaluating the modularity proposal:

1. Language acquisition. Here, the issue is whether language is acquired

according to its own learning principles or whether it is acquired like

other cognitive skills.

2. Language comprehension. Here, the issue is whether major aspects of

language processing occur without utilization of any general cognitive

processes.

We will consider some of the issues with respect to comprehension in the

next chapter. In this chapter, we will look at what is known about language

acquisition. After an overview of the general course of language acquisition by

young children, we will turn to the implications of the language-acquisition

process for the uniqueness of language.

The modularity position holds that the acquisition and processing of language

is independent from other cognitive systems.

•Language Acquisition

Having watched my two children acquire a language, I understand how easy it

is to lose sight of what a remarkable feat it is. Days and weeks go by with little

apparent change in their linguistic abilities. Progress seems slow. However,

something remarkable is happening. With very little and often no deliberate

instruction, children by the time they reach age 10 have accomplished implicitly


what generations of Ph.D. linguists have not accomplished explicitly. They have

internalized all the major rules of a natural language—and there appear to be

thousands of such rules with subtle interactions. No linguist in a lifetime has

been able to formulate a grammar for any language that will identify all and only

the grammatical sentences. However, as we progress through childhood, we do

internalize such a grammar. Unfortunately for the linguist, our knowledge of the

grammar of our language is not something that we can articulate. It is implicit

knowledge (see Chapter 7), which we can only display in using the language.

The process by which children acquire a language has some characteristic

features that seem to hold no matter what their native language is (and

languages throughout the world differ dramatically): Children are notoriously

noisy creatures from birth. At first, there is little variety in their speech. Their

vocalizations consist almost totally of an ah sound (although they can produce

it at different intensities and with different emotional tones). In the months

following birth, a child’s vocal apparatus matures. At about 6 months, a change

takes place in children’s utterances. They begin to engage in what is called

babbling, which consists of generating a rich variety of speech sounds with

interesting intonation patterns. However, the sounds are generally meaningless to listeners.

An interesting feature of early childhood speech is that children produce

sounds that they will not use in the particular language that they will learn.

Moreover, they can apparently make acoustic discriminations among sounds

that will not be used in their language. For instance, Japanese infants can discriminate

between /l/ and /r/, a discrimination that Japanese adults cannot

make (Tsushima et al., 1994). Similarly, English infants can discriminate among variations of the /t/ sound that are important in the Hindi language of India but that English adults cannot distinguish (Werker & Tees, 1999). It is as if the

children enter the world with speech and perceptual capabilities that constitute

a block of marble out of which will be carved their particular language, discarding

what is not necessary for that language.

When a child is about a year old, the first words appear, always a point of

great excitement to the child’s parents. The very first words are there only to the

ears of very sympathetic parents and caretakers, but soon the child develops a

considerable repertoire of words, which are recognizable to the untrained ear

and which the child uses effectively to make requests and to describe what is

happening. The early words are concrete and refer to the here and now. Among

my children’s first words were Mommy, Daddy, Rogers (for Mister Rogers),

cheese, ’puter (for computer), eat, hi, bye, go, and hot. One remarkable feature of

this stage is that the speech consists only of one-word utterances; even though

the children know many words, they never put them together to make multiple-word

phrases. Children’s use of single words is quite complex. They often use a

single word to communicate a whole thought. Children will also overextend

their words. Thus, the word dog might be used to refer to any furry four-legged

animal.

The one-word stage, which lasts about 6 months, is followed by a stage in

which children will put two words together. I can still remember our excitement

as parents when our son said his first two-word utterance at 18 months—more


gee, which meant for him “more brie”—he was a connoisseur of cheese.

Table 12.1 illustrates some of the typical two-word utterances generated by

children at this stage. All their utterances are one or two words. Once their

utterances extend beyond two words, they are of many different lengths.

There is no corresponding three-word stage. The two-word utterances correspond

to about a dozen or so semantic relations, including agent-action,

agent-object, action-object, object-location, object-attribute, possessor-object,

negation-object, and negation-event. The order in which the children

place these words usually corresponds to one of the orders that would

be correct in adult speech in the children’s linguistic community.

Even when children leave the two-word stage and speak in sentences

ranging from three to eight words, their speech retains a peculiar quality, which

is sometimes referred to as telegraphic. Table 12.2 contains some of these longer

multiword utterances. The children speak somewhat as people used to write telegrams (and as people currently do when text messaging), omitting such

unimportant function words as the and is. In fact, it is

rare to find in early-childhood speech any utterance

that would be considered to be a well-formed sentence.

Yet, out of this beginning, grammatical sentences eventually

appear. One might expect that children would

learn to speak some kinds of sentences perfectly, then

learn to speak other kinds of sentences perfectly, and

so on. However, it seems that children start out speaking

all kinds of sentences and all of them imperfectly.

Their language development is characterized not by

learning more kinds of sentences but by their sentences

becoming gradually better approximations of

adult sentences.

Besides the missing words, there are other dimensions in which children’s

early speech is incomplete. A classic example concerns the rules for pluralization

in English. Initially, children do not distinguish in their speech between

singular and plural, using a singular form for both. Then, they will learn the

add s rule for pluralization but overextend it, producing foots or even feets.

Gradually, they learn the pluralization rules for the irregular words. This learning

continues into adulthood. Cognitive scientists have to learn that the plural

of schema is schemata (a fact that I spared the reader from having to deal with

when schemas were discussed in Chapter 5).

Another dimension in which children have to perfect their language is word

order. They have particular difficulties with transformational movements of

terms from their natural position in the phrase structure (see the earlier discussion

in this chapter). So, for instance, there is a point at which children form

questions without moving the verb auxiliary from the verb phrase:

• What me think?
• What the doggie have?

Even later, when children’s spontaneous speech seems to be well formed, they

will display errors in comprehension that reveal that they have not yet captured


TABLE 12.1
Two-Word Utterances

Kendall swim        pillow fell
doggie bark         Kendall book
see Kendall         Papa door
writing book        Kendall turn
sit pool            towel bed
shoe off            there cow

From Bowerman (1973).

TABLE 12.2
Multiword Utterances

Put truck window           My balloon pop
Want more grape juice      Doggie bit me mine boot
Sit Adam chair             That Mommy nose right there
Mommy put sack             She’s wear that hat
No I see truck             I like pick dirt up firetruck
Adam fall toy              No pictures in there

From Brown (1973).


all the subtleties in their language. For instance, Chomsky (1970) found that

children had difficulty comprehending sentences such as John promised Bill to

leave, interpreting Bill as the one who leaves. The verb promise is unusual in this

respect—for instance, compare John told Bill to leave, which children will properly

interpret.

By the time children are 6 years old, they have mastered most of their language,

although they continue to pick up details at least until the age of 10. In

that time, they have learned tens of thousands of special case rules and tens of

thousands of words. Studies of the rate of word acquisition by children produced

an estimate of more than five words a day (Carey, 1978; E.V. Clark, 1983).

A natural language requires more knowledge to be acquired for mastery than do

any of the domains of expertise considered in Chapter 9. Of course, children also

put an enormous amount of time into the language-acquisition process—easily

10,000 hr must have been spent practicing speaking and understanding speech

before a child is 6 years old.

Children gradually approximate adult speech by producing ever larger and

more complex constructions.

The Issue of Rules and the Case of Past Tense

A controversy in the study of language acquisition concerns whether children

are learning what might be considered rules such as those that are part of

linguistic theory. For instance, when a child learning English begins to inflect a

verb such as kick with ed to indicate past tense, is that child learning a past-tense

rule or is the child just learning to associate kick and ed? A young child

certainly cannot explicitly articulate the add ed rule, but this inability may just

mean that this knowledge is implicit. An interesting observation in this regard

is that children will generalize the rule to new verbs. If they are introduced to

a new verb (e.g., told that the made-up verb wug means dance), they will

spontaneously generate this verb with the appropriate past tense (wugged in

this example).

Some of the interesting evidence on this score concerns how children learn

to deal with irregular past tenses—for instance, the past tense of sing is sang.

The order in which children learn to inflect verbs for past tense follows the

characteristic sequence noted for pluralization. First, children will use the irregular

correctly, generating sang; then they will overgeneralize the past-tense rule

and generate singed; finally, they will get it right for good and return to sang.

The existence of this intermediate stage of overgeneralization has been used to

argue for the existence of rules, because it is claimed there is no way that the

child could have learned from direct experience to associate ed to sing. Rather,

the argument goes, the child must be overgeneralizing a rule that has been

learned.

This conventional interpretation of the acquisition of past tense was severely

challenged by Rumelhart and McClelland (1986a). They simulated a neural network

as illustrated in Figure 12.6 and had it learn the past tenses of verbs. In the


network, one inputs the root form of a verb (e.g., kick, sing) and, after a number

of layers of association, the past-tense form should appear.

FIGURE 12.6 A network for past tense. The phonological representation of the root is converted by a fixed encoding network into a distributed feature representation. A pattern associator with modifiable connections converts this into the distributed feature representation of the past tense, which a decoding/binding network then maps onto a phonological representation of the past tense. (From Rumelhart & McClelland, 1986a.)

The computer model was trained with a set of 420 pairs of root and past-tense forms. It simulated a neural learning mechanism to acquire the pairs. Such a

system learns to associate features of the input with features of the output. Thus,

it might learn that words beginning with “s” are associated with past tense endings

of “ed,” thus leading to the “singed” overgeneralization (but things can be

more complex in such neural models). The model mirrored the standard developmental

sequence of children, first generating correct irregulars, then overgeneralizing,

and finally getting it right. It went through the intermediate stage of

generating past-tense forms such as singed because of generalization from regular

past-tense forms. With enough practice, the model, in effect, memorized the

past-tense forms and was not using generalization. Rumelhart and McClelland

concluded:

We have, we believe, provided a distinct alternative to the view that children

learn the rules of English past-tense formation in any explicit sense. We have

shown that a reasonable account of the acquisition of past tense can be

provided without recourse to the notion of a “rule” as anything more than a

description of the language. We have shown that, for this case, there is no

induction problem. The child need not figure out what the rules are, nor even

that there are rules. (p. 267)
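To give a concrete feel for this kind of account, here is a minimal sketch in Python. It is not Rumelhart and McClelland's actual network (which mapped phonological feature vectors onto phonological feature vectors); it is a drastically simplified evidence tally over a made-up bigram coding, with an invented toy vocabulary. Even so, it generalizes the regular pattern to a novel verb, overgeneralizes it to sing, and then recovers with further irregular experience, without anything that looks like an explicitly represented rule:

```python
from collections import defaultdict

# Toy associative learner, loosely in the spirit of Rumelhart and
# McClelland (1986a) but drastically simplified: the bigram coding and
# the tiny vocabulary are illustrative assumptions, not the model's
# actual Wickelfeature representation or training set.

def features(verb):
    # Character bigrams stand in for the phonological feature coding.
    padded = "#" + verb + "#"
    return [padded[i:i + 2] for i in range(len(padded) - 1)]

# evidence[f] > 0: feature f has gone with the regular "add -ed" pattern;
# evidence[f] < 0: it has gone with an irregular vowel change.
evidence = defaultdict(float)

def learn(root, past):
    sign = 1.0 if past == root + "ed" else -1.0
    for f in features(root):
        evidence[f] += sign

def prefers_regular(root):
    return sum(evidence[f] for f in features(root)) >= 0

# Early vocabulary dominated by regular verbs.
for root, past in [("kick", "kicked"), ("walk", "walked"),
                   ("jump", "jumped"), ("play", "played"),
                   ("look", "looked"), ("call", "called")]:
    learn(root, past)

print(prefers_regular("wug"))   # True: novel verb -> "wugged" (the wug test)
print(prefers_regular("sing"))  # True: overgeneralization -> "singed"

# Repeated exposure to irregulars shifts the feature evidence, and the
# overgeneralization recedes (the third stage of the sequence).
for _ in range(4):
    learn("sing", "sang")
    learn("ring", "rang")
print(prefers_regular("sing"))  # False: back to "sang"
```

That, in miniature, is Rumelhart and McClelland's point: the wug generalization and the singed stage both fall out of feature-level associations.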

Their claims drew a major counter-response from Pinker and Prince (1988).

Pinker and Prince pointed out that the ability to produce the initial stage of correct irregulars depended on Rumelhart and McClelland’s using a disproportionately large number of irregulars at first—more so than the child experiences.

They had a number of other criticisms of the model, including the fact

that it sometimes produced utterances that children never produce—for instance,

it produced membled as the past tense of mail.

Another of their criticisms had to do with whether it was even possible to

really learn past tense as the process of associating root form with past-tense

form. It turns out that the way a verb is inflected for past tense does not depend

just on its root form but also on its meaning. For instance, the word ring has

two meanings as a verb—to make a sound or to encircle. Although it is the

same root, the past tense of the first is rang, whereas the past tense of the latter

is ringed, as in

• He rang the bell.
• They ringed the fort with soldiers.

It is unclear how fundamental any of these criticisms are, and there are now a

number of more adequate attempts to come up with such associative models

(e.g., MacWhinney & Leinbach, 1991; Daugherty, MacDonald, Petersen, &

Seidenberg, 1993; and, for a rejoinder, see Marcus et al., 1995).

Marslen-Wilson and Tyler (1998) argued that the debate between rule-based

and associative accounts will not be settled by focusing only on children’s

language acquisition. They suggest that more decisive evidence will come from

examining properties of the neural system that implements adult processing of

past tense. They cite two sorts of evidence, which seem to converge in their

implications about the nature of the processing of past tense. First, they cite

evidence that some patients with aphasias have deficient processing of regular

past tense, whereas others have deficient processing of irregular past tenses. The

patients with deficient processing of regular past tense have severe damage to

Broca’s area, which is generally associated with syntactic processing. In contrast,

the patients with deficient processing of irregular past tenses have damage to

their temporal lobes, which are generally associated with associative learning.

Second, they cite the PET imaging data of Jaeger et al. (1996), who studied the

processing of past tense by unimpaired adults. Jaeger et al. found activation in

the region of Broca’s area only during the processing of regular past tense and

found temporal activation during the processing of irregular past tenses. On

the basis of the data, Marslen-Wilson and Tyler concluded that regular past

tense may be processed in a rule-based manner, whereas the irregular may be

processed in an associative manner.

Irregular past tenses are produced associatively, and there is debate about

whether regular past tenses are produced associatively or by rules.

Quality of Input

An important difference between a child’s first-language acquisition and the

acquisition of many skills (including typical second-language acquisition) is

that the child receives little if any instruction in acquiring his or her first


language. Thus, the child’s task is one of inducing the structure of natural language

from listening to parents, caretakers, and older children. In addition to

not receiving any direct instruction, the child does not get much information

about what are incorrect forms in natural language. Many parents do not correct

their children’s speech at all, and those who do correct their children’s

speech appear to do so without any effect. Consider the following well-known

interaction recorded between a parent and a child (McNeill, 1966):

Child: Nobody don’t like me.

Mother: No, say, “Nobody likes me.”

Child: Nobody don’t like me.

Mother: No, say, “Nobody likes me.”

Child: Nobody don’t like me.

[dialogue repeated eight times]

Mother: Now listen carefully; say, “Nobody likes me.”

Child: Oh! Nobody don’t likeS me.

This lack of negative information is puzzling to theorists of natural language

acquisition. We have seen that children’s early speech is full of errors. If they are

never told about their errors, why do children ever abandon these incorrect

ways of speaking and adopt the correct forms?

Because children do not get much instruction on the nature of language

and ignore most of what they get, their learning task is one of induction—they

must infer from the utterances that they hear what the acceptable utterances in

their language are. This task is very difficult under the best of conditions, and

children often do not operate under the best of conditions. For instance, children

hear ungrammatical sentences mixed in with the grammatical. How are

they to avoid being misled by these sentences? Some parents and caregivers

are careful to make their utterances to children simple and clear. This kind of

speech, consisting of short sentences with exaggerated intonation, is called

motherese (Snow & Ferguson, 1977). However, not all children receive the

benefit of such speech, and yet all children learn their native languages. Some

parents speak to their children in only adult sentences, and the children learn

(Kaluli, studied by Schieffelin, 1979); other parents do not speak to their children

at all, and still the children learn by overhearing adults speak (Piedmont

Carolinas, studied by Heath, 1983). Moreover, among more typical parents,

there is no correlation between the degree to which motherese is used and the rate of linguistic development (Gleitman, Newport, & Gleitman, 1984). So the quality

of the input cannot be that critical.

Another curious fact is that children appear to be capable of learning a language in the absence of any linguistic input. Goldin-Meadow (2003) summarized

research on the deaf children of speaking parents who chose to teach their

children by the oral method. It is very difficult for deaf children to learn to speak but quite easy for them to learn sign language, which is a perfectly fine language. Despite the fact that the parents of these children were not teaching them sign language, the children proceeded to invent their own sign languages to communicate

with their parents. These invented languages have the structure of

normal languages. Moreover, the children in the process of invention seem to


go through the same periods as children who are learning a language of their

community. That is, they start out with single manual gestures, then progress to

a two-gesture period, and continue to evolve a complete language more or less

at the same points in time as those of their hearing peers. Thus, children seem

to be born with a propensity to communicate and will learn a language no

matter what.

The very fact that young children learn a language so successfully in almost

all circumstances has been used to argue that the way that we learn language

must be different from the way that we learn other cognitive skills. It has also been pointed out that children learn their first language successfully at a point in development when their general intellectual abilities are still weak.

Children master language at a very young age and with little direct instruction.

A Critical Period for Language Acquisition

A related argument has to do with the claim that young children appear to

acquire a second language much faster than older children or adults do. It is

claimed that there is a certain critical period, from 2 to about 12 years of age,

when it is easiest to learn a language. For a long time, the claim that children

learn second languages more readily than adults was based on informal observations

of children of various ages and of adults in new linguistic communities—

for example, when families move to another country in response to a corporate

assignment or when immigrants move to another country to reside there

permanently. Young children are said to acquire a facility to get along in the

new language more quickly than their older siblings or their parents. However,

there are a great many differences between the adults, the older children, and

the younger children in amount of linguistic exposure, type of exposure (e.g.,

whether the stock market, history, or Nintendo is being discussed), and willingness

to try to learn (McLaughlin, 1978; Nida, 1971). In careful studies in which situations were selected to control for these factors, a positive relation is found between children’s ages and rate of language development

(Ervin-Tripp, 1974). That is, older children (older than 12 years) learn faster

than younger children.

Even though older children and adults may learn a new language more rapidly

than younger children initially, they seem not to acquire the same level of final

mastery of the fine points of language, such as the phonology and morphology

(Lieberman, 1984; Newport, 1986). For instance, the ability to speak a second

language without an accent severely deteriorates with age (Oyama, 1978). In

one study, Johnson and Newport (1989) looked at the degree of proficiency

in speaking English achieved by Koreans and Chinese as a function of the age

at which they arrived in America. All had been in the United States for about

10 years. In general, it seems that the later they came to America, the poorer

their performance was on a variety of measures of syntactic facility. Thus,

although it is not true that language learning is fastest for the youngest, it does seem that the greatest eventual mastery of the fine points of language is achieved by those who start very young.


Figure 12.7 shows some data from Flege, Yeni-Komshian, and Liu (1999) looking

at the performance of 240 Korean immigrants to the United States. For measures

of both foreign accent and syntactic errors, there is a steady decrease in performance

with age of arrival in the United States. The data give some suggestion

of a more rapid drop around the age of 10—which would be consistent with the

hypothesis of a critical period in language acquisition. However, age of arrival

turns out to be confounded with many other things, and one critical factor is the

relative use of Korean versus English. Based on questionnaire

data, Flege et al. rated these participants with respect to the

relative frequency with which they used English versus

Korean. Figure 12.8 displays this data and shows that there

is a steady decrease in use of English to about the point of

the critical period at which participants reported approximately

equal use of the two languages. Perhaps the decrease

in English performance reflects this difference in amount

of use. To address this question, Flege et al. created two

matched groups (subsets of the original 240) who reported

equal use of English, but one group averaged 9.7 years

when they arrived in the United States and the other group

averaged 16.2. The two groups did not differ on measures

of syntax, but the later arriving group still showed a

stronger accent. Thus, it seems that there may not be a critical

period for acquisition of syntactic knowledge but there

may be one for acquisition of phonological knowledge.

FIGURE 12.7 Mean language scores of 24 native English speakers and 240 native Korean participants as a function of age of arrival in the United States. (a) Scores on test of foreign accent (lower scores mean stronger accent) and (b) scores on tests of morphosyntax (lower scores mean more errors). (From Flege et al., 1999.)

FIGURE 12.8 Relative use of English versus Korean as a function of age of arrival in the United States. (From Flege et al., 1999.)

FIGURE 12.9 ERP patterns produced in response to grammatical anomalies in English in left and right hemispheres, for bilinguals whose age of second-language acquisition was 1–3, 4–6, or 11–13 years. (From Weber-Fox & Neville, 1996.)

Weber-Fox and Neville (1996) presented an interesting analysis of the effects of age of acquisition on language processing. They compared Chinese-English

bilinguals who had learned English as a second language at different ages. One

of their tests included an ERP measurement of sensitivity to syntactic violations

in English. English monolinguals show a strong left lateralization in their

response to such violations, which is a sign of the left lateralization of language.

Figure 12.9 compares the two hemispheres in these adult bilinguals as a function

of the age at which they acquired English. Adults who had learned English

in their first years of life show strong left lateralization like those who learn

English as a first language. If their acquisition was delayed until between 12 and 13 years of age, they show almost no lateralization. Those who had acquired

English at an intermediate age show an intermediate amount of lateralization.

Interestingly, Weber-Fox and Neville reported no such critical period for lexical

or semantic violations. Learning English as late as 16 years of age had almost no

effect on the lateralization of their responses to semantic violations. Thus,

grammar seems to be more sensitive to a critical period.

Most studies on the effect of age of acquisition have naturally concerned

second languages. However, an interesting study of first-language acquisition


was done by Newport and Supalla (1990). They looked at the acquisition of

American Sign Language, one of the few languages that is acquired as a first

language in adolescence or adulthood. Deaf children of speaking parents are

sometimes not exposed to the sign language until late in life and consequently

acquire no language in their early years. Adults who acquire sign language

achieve a poorer ultimate mastery of it than children do.

There are age-related differences in the success with which children can acquire

a new language, with the strongest effects on phonology, intermediate

effects on syntax, and weakest effects on semantics.

Language Universals

Chomsky (1965) argued that special innate mechanisms underlie the acquisition

of language. Specifically, his claim is that the number of formal possibilities for a

natural language is so great that learning the language would simply be impossible

unless we possessed some innate information about the possible forms of

natural human languages. It is possible to prove formally that Chomsky is correct

in his claim. Although the formal analysis is beyond the scope of this book, an

analogy might help. In Chomsky’s view, the problem that child learners face is to

discover the grammar of their language when only given instances of utterances

of the language. The task can be compared to trying to find a matching sock

(language) from a huge pile of socks (set of possible languages). One can use

various features (utterances) of the sock in hand to determine whether any particular

sock in the pile is the matching one. If the pile of socks is big enough and

the socks are similar enough, this task would prove to be impossible. Likewise,

enough formally possible grammars are similar enough to one another to make it impossible to learn an arbitrary formal language from examples alone. However, because

language learning obviously occurs, we must, according to Chomsky, have

special innate knowledge that allows us to substantially restrict the number of

possible grammars that we have to consider. In the sock analogy, it would be like

knowing ahead of time which part of the pile to inspect. So, although we cannot

learn all possible languages, we can learn a special subset of them.

Chomsky proposed the existence of language universals that limit the possible

characteristics of a natural language and a natural grammar. He assumes

that children can learn a natural language because they possess innate knowledge

of these language universals. A language that violated these universals

would simply be unlearnable, which means that there are hypothetical languages

that no humans could learn. Languages that humans can learn are referred to

as natural languages.

As already noted, we can formally prove that Chomsky’s assertion is

correct—that is, constraints on the possible forms of a natural language must

exist. However, the critical issue is whether these constraints are due to any

linguistic-specific knowledge on the part of children or whether they are simply

general cognitive constraints on learning mechanisms. Chomsky would argue

that the constraints are language specific. It is this claim that is open to serious

question. The issue is: Are the constraints on the form of natural languages

universals of language or universals of cognition?


In speaking of language universals, Chomsky is concerned with a competence

grammar. Recall that a competence analysis is concerned with an abstract

specification of what a speaker knows about a language; in contrast, a performance

analysis is concerned with the way in which a speaker uses language.

Thus, Chomsky is claiming that children possess innate constraints about the

types of phrase structures and transformations that might be found in a natural

language. Because of the abstract, nonperformance-based character of these

purported universals, one cannot simply evaluate Chomsky’s claim by observing

the details of acquisition of any particular language. Rather, the strategy is to

look for properties that are true of all languages or of the acquisition of all

languages. These universal properties would be manifestations of the language

universals that Chomsky postulates.

Although languages can be quite different from one another, some clear uniformities,

or near-uniformities, exist. For instance, as we saw earlier, virtually no

language favors the object-before-subject word order. However, as noted, this

constraint appears to have a cognitive explanation (as do many other limits on

language form).

Often, the uniformities among languages seem so natural that we do not

realize that other possibilities might exist. One such language universal is that

adjectives appear near the nouns that they modify. Thus, we translate The brave
woman hit the cruel man into French as

• La femme brave a frappé l’homme cruel

and not as

• La femme cruelle a frappé l’homme brave

although a language in which the adjective beside the subject noun modified

the object noun and vice versa would be logically possible. Clearly, however,

such a language design would be absurd in regard to its cognitive demands. It

would require that listeners hold the adjective from the beginning of the sentence

until the noun at the end. No natural language has this perverse structure.

If it really needed showing, I showed with artificial languages that adult participants

were unable to learn such a language (Anderson, 1978b). Thus, many of

the universals of language seem cognitive in origin and so do not really support

Chomsky’s position. In the next subsections, we will consider some universals

that seem more language specific.

There are universal constraints on the kinds of languages that humans can learn.

The Constraints on Transformations

A set of peculiar constraints on movement transformations (refer to the subsection

on transformations on page 332) has been used to argue for the existence of

linguistic universals. One of the more extensively discussed of these constraints

is called the A-over-A constraint. Compare sentence 1 with sentence 2:

1. Which woman did John meet who knows the senator?

2. Which senator did John meet the woman who knows?


Linguists would consider sentence 1 to be acceptable but not sentence 2. Sentence

1 can be derived by a transformation from sentence 3. This transformation

moves which woman forward:

3. John met which woman who knows the senator?

4. John met the woman who knows which senator?

Sentence 2 could be derived by a similar transformation operating on which

senator in sentence 4, but apparently transformations are not allowed that move

a noun phrase such as which senator if it is embedded within another noun

phrase (in this case, which senator is part of the clause modifying the woman

and so is part of the noun phrase associated with the woman). Transformations

can move deeply embedded nouns if these nouns are not in clauses modifying

other nouns. So, for instance, sentence 5, which is acceptable, is derived transformationally

from sentence 6:

5. Which senator does Mary believe that Bill said that John likes?

6. Mary believes that Bill said that John likes which senator?

Thus, we see that the constraint on the transformation that forms which questions

is arbitrary. It can apply to any embedded noun unless that noun is part

of another noun phrase. The arbitrariness of this constraint makes it hard to

imagine how a child would ever figure it out—unless the child already knew it

as a universal of language. Certainly, the child is never explicitly told this fact

about language.
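The constraint itself is easy to state over phrase-structure trees, even though it is hard to see how a learner could induce it from examples. Here is a minimal sketch; the tree encodings are simplified illustrations rather than serious syntactic analyses, but they capture the relevant configuration: in sentence 4, which senator sits inside the larger noun phrase built around the woman, whereas in sentence 3, which woman is not inside any other noun phrase.

```python
# Minimal sketch of the A-over-A constraint: a noun phrase may not be
# moved if it is dominated by another noun phrase. Trees are
# (label, children...) tuples; the analyses are simplified illustrations.

def extraction_allowed(tree, target, under_np=False):
    """True/False once `target` is found; None if it is not in this subtree."""
    if tree is target:
        return not under_np
    label = tree[0]
    for child in tree[1:]:
        if isinstance(child, tuple):
            result = extraction_allowed(child, target,
                                        under_np or label == "NP")
            if result is not None:
                return result
    return None

# Sentence 3: "John met which woman who knows the senator?"
which_woman = ("NP", "which woman", ("RelClause", "who knows the senator"))
s3 = ("S", ("NP", "John"), ("VP", "met", which_woman))

# Sentence 4: "John met the woman who knows which senator?"
which_senator = ("NP", "which senator")
s4 = ("S", ("NP", "John"),
      ("VP", "met", ("NP", "the woman",
                     ("RelClause", "who knows", which_senator))))

print(extraction_allowed(s3, which_woman))    # True: sentence 1 is derivable
print(extraction_allowed(s4, which_senator))  # False: sentence 2 is blocked
```

Stating the check is trivial; the puzzle is how a child could induce it from positive examples alone, given that sentences like 2 are simply absent from the input rather than being corrected.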

The existence of such constraints on the form of language offers a challenge

to any theory of language acquisition. The constraints are so peculiar that it is

hard to imagine how they could be learned unless a child was especially prepared

to deal with them.

There are rather arbitrary constraints on the movements that transformations

can produce.

Parameter Setting

With all this discussion about language universals, one might get the impression

that all languages are basically alike. Far from it. On many dimensions, the languages of the world are radically different. They might have some abstract properties in common, such as the transformational constraint discussed above, but there are many properties on which they differ. As already

mentioned, different languages prefer different orders for subject, verb, and

object. Languages also differ in how strict they are about word order. English is

very strict, but some highly inflected languages, such as Finnish, allow people to

say their sentences with almost any word order they choose. There are languages

that do not mark verbs for tense and languages that mark verbs for the flexibility

of the object being acted on.

Another example of a difference, which has been a focus of discussion, is

that some languages, such as Italian or Spanish, are what are called pro-drop


languages: They allow one to optionally drop the pronoun when it appears in

the subject position. Thus, whereas in English we would say, I am going to the

cinema tonight, Italians can say, Vado al cinema stasera, and Spaniards, Voy al

cine esta noche—in both cases, just starting with the verb and omitting the first-person

pronoun. It has been argued that pro-drop is a parameter on which natural

languages vary, and although children cannot be born knowing whether

their language is pro-drop or not, they can be born knowing it is one way or the

other. Thus, knowledge that the pro-drop parameter exists is one of the purported

universals of natural language.

Knowledge of a parameter such as pro-drop is useful because a number of

features are determined by it. For instance, if a language is not pro-drop, it requires

what are called expletive pronouns. In English, a non-pro-drop language,

the expletive pronouns are it and there when they are used in sentences such

as It is raining or There is no money. English requires these rather semantically

empty pronouns because, by definition, a non-pro-drop language cannot have

empty slots in the subject position. Pro-drop languages such as Spanish and

Italian lack such empty pronouns because they are not needed.

Hyams (1986) argued that children starting to learn any language, including

English, will treat it as a pro-drop language and optionally drop pronouns

even though doing so may not be correct in the adult language. She noted that

young children learning English tend to omit subjects. They will also not use

expletive pronouns, even when they are part of the adult language. When

children learning a non-pro-drop language start using expletive pronouns, they simultaneously stop optionally dropping pronouns in the subject position. Hyams argued that, at this point, they learn that their language is not a pro-drop language. For further discussion of Hyams’s proposal and alternative

formulations, read R. Bloom (1994).

It is argued that much of the variability among natural languages can be accommodated

by setting 100 or so parameters, such as the pro-drop parameter,

and that a major part of learning a language is learning the setting of these

parameters (of course, there is a lot more to be learned than just this setting—

e.g., an enormous vocabulary). This theory of language acquisition is called

the parameter setting proposal. It is quite controversial, but it provides us with

one picture of what it might mean for a child to be prepared to learn a language

with innate, language-specific knowledge.
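A minimal sketch of this learning scheme follows. The default setting and the expletive trigger follow Hyams's argument; the representation of input sentences, and the assumption that a single expletive is decisive, are illustrative simplifications rather than claims about the actual mechanism.

```python
# Sketch of setting the pro-drop parameter from input. Following Hyams
# (1986), the learner's default is pro-drop; observing an expletive
# subject, which only non-pro-drop languages permit, resets the parameter.

EXPLETIVES = {"it", "there"}  # semantically empty English subjects

class ParameterLearner:
    def __init__(self):
        self.pro_drop = True  # Hyams: the child's initial setting

    def observe(self, sentence):
        first_word = sentence.split()[0].lower()
        if first_word in EXPLETIVES:
            # Expletives have no place in a pro-drop grammar, so one
            # clear example is treated as decisive (a simplification).
            self.pro_drop = False

english_learner = ParameterLearner()
english_learner.observe("It is raining")
print(english_learner.pro_drop)   # False: English is not pro-drop

italian_learner = ParameterLearner()
italian_learner.observe("Vado al cinema stasera")
print(italian_learner.pro_drop)   # True: the default survives subjectless input
```

The attraction of the proposal is visible even in this toy: a single innate binary choice plus a triggering observation replaces open-ended induction over possible grammars.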

Learning the structure of language has been proposed to include learning the

setting of 100 or so parameters on which natural languages vary.

•Conclusions: The Uniqueness of Language: A Summary

Although it is clear that human language is a unique communication

system relative to those of other species, the jury is still very much out on the

issue of whether language is really a system different from other human cognitive

systems. The status of language is a major issue for cognitive psychology.


The issue will be resolved by empirical and theoretical efforts more detailed

than those reviewed in this chapter. The ideas here have served to define the

context for the investigation. The next chapter will review the current state of

our knowledge about the details of language comprehension. Careful experimental

research on such topics will finally resolve the question of the uniqueness

of language.

Questions for Thought

1. There have emerged a number of computer-based

approaches to representing meaning that are based on

having these programs read through large sets of documents

and representing the meaning of a word in terms

of what other words also occurred with it in these

documents. One interesting feature of these efforts is

that they have no knowledge of the physical world

and what these words refer to. Perhaps the most

well-known system is called Latent Semantic Analysis

(LSA—Landauer, Foltz, & Laham, 1998). The authors

of LSA describe the knowledge in their system as

“analogous to a well-read nun’s knowledge of sex, a

level of knowledge often deemed a sufficient basis for

advising the young” (p. 5). Based on this knowledge,

LSA was able to pass the vocabulary test from the Educational

Testing Service’s Test of English as a Foreign

Language. The test requires that one choose which of

four alternatives best matches the meaning of a word,

and LSA was able to do this by comparing its meaning

representation of the word (based on what documents

the word appeared in) with its meaning representation

of the alternatives (again based on the same information).

Why do you think such a program is so successful?

How would you devise a vocabulary test to expose

aspects of meaning that it does not represent? (A minimal sketch of the idea behind LSA appears after these questions.)

2. In addition to the pauses and speech errors discussed in

the chapter, spontaneous speech contains fillers like uh

and um in English (different languages use different

fillers). Clark and Fox Tree (2002) report that um tends

to be associated with a longer delay in speech than uh.

In terms of phrase structure, where would you expect

to see uh and um located?

3. Some languages assign grammatical genders to words

that do not have inherent genders and appear to do so

arbitrarily. So, for instance, the German word for key

is masculine and the Spanish word for key is feminine.

Boroditsky, Schmidt, and Phillips (2003) report that

when asked to describe a key, German speakers are

more likely to use words like hard and jagged, whereas

Spanish speakers are more likely to use words like

shiny and tiny. What does evidence like this say about

the relationship between language and thought?

4. When two linguistic communities come into frequent contact, such as in trade, they develop simplified languages,

called pidgins, for communicating. These languages are

generally considered not full natural languages. However,

if these language communities live together, the

pidgins will evolve into full-fledged new languages

called creoles. This can happen in one generation, in

which the parents who first made contact with the new

linguistic community continue to use pidgin, whereas

their children are speaking full-fledged creoles. What

does this say about the possible role of a critical period

in language acquisition?
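For readers who want the idea behind LSA in question 1 made concrete, here is a minimal sketch. The four-document corpus is invented and absurdly small (real LSA uses enormous document collections and a few hundred dimensions), but the mechanics shown, a word-by-document count matrix reduced by singular value decomposition with similarity measured as the cosine between word vectors, are the essential ingredients:

```python
import numpy as np

# Toy sketch of the idea behind LSA (Landauer, Foltz, & Laham, 1998):
# a word's meaning is a vector derived from word-document co-occurrence.
# The "documents" below are invented for illustration.

docs = [
    "the doctor treated the patient in the clinic",
    "the physician treated the patient at the hospital",
    "the lawyer argued the case in court",
    "the attorney argued the case before the judge",
]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-document count matrix.
X = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        X[index[w], j] += 1

# A truncated SVD gives every word a low-dimensional meaning vector,
# even for words (doctor, physician) that never share a document.
U, s, _ = np.linalg.svd(X, full_matrices=False)
vectors = U[:, :2] * s[:2]

def similarity(w1, w2):
    a, b = vectors[index[w1]], vectors[index[w2]]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A TOEFL-style item: which alternative is closest to "doctor"?
print(max(["physician", "lawyer", "judge"],
          key=lambda w: similarity("doctor", w)))  # physician
```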


Key Terms

competence

grammar

language universals

linguistic determinism

linguistic intuitions

linguistics

modularity

natural languages

parameter setting

performance

phonology

phrase structure

productivity

regularity

semantics

syntax

transformation


13

Language Comprehension

A favorite device in science fiction is the computer or robot that can understand

and speak language—whether evil like HAL in 2001, or beneficent like C3PO in

Star Wars. Workers in artificial intelligence have been trying to develop computers

that understand and generate language. Progress is being made, but Stanley Kubrick

was clearly incorrect when he projected HAL for the year 2001. Language-processing

AI programs are still rudimentary compared with what is portrayed in science fiction.

An enormous amount of knowledge and intelligence underlies the successful use of

language.

This chapter will look at language use and, in particular, at language comprehension

(as distinct from language generation). This focus will enable us to look where

the light is—more is known about language comprehension than about language generation.

Language comprehension will be considered in regard to both listening and

reading. The listening process is often thought to be the more basic of the two.

However, many of the same factors apply to both listening and reading. Researchers’

choice between written or spoken material is determined by what is easier to do

experimentally. More often than not, written material is used.

We will consider a detailed analysis of the process of language comprehension,

breaking it down into three stages. The first stage involves the perceptual

processes that encode the spoken (acoustic) or written message. The second stage

is termed the parsing stage. Parsing is the process by which the words in the message

are transformed into a mental representation of the combined meaning of the

words. The third stage is the utilization stage, in which comprehenders use the

mental representation of the sentence’s meaning. If the sentence is an assertion,

listeners may simply store the meaning in memory; if it is a question, they may

answer; if it is an instruction, they may obey. However, listeners are not always so

compliant. They may use an assertion about the weather to make an inference

about the speaker’s personality, they may answer a question with a question, or

they may do just the opposite of what the speaker asks. These three stages—

perception, parsing, and utilization—are by necessity partly ordered in time; however,

they also partly overlap. Listeners can make inferences from the first part of a


sentence while they are perceiving a later part. This chapter will focus on the two

higher-level processes—parsing and utilization. (The perceptual stage was discussed

in Chapter 2.)
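As a purely schematic illustration of this stage analysis (every detail below is a placeholder, and, as just noted, the real stages partly overlap rather than running strictly in sequence), the three stages can be laid out as a pipeline:

```python
# Schematic pipeline for the three comprehension stages. The bodies are
# deliberately trivial stand-ins; only the staging is the point.

memory = []

def perceive(signal):
    # Perceptual stage: encode the spoken or written message into words.
    return signal.rstrip(".?").split()

def parse(words):
    # Parsing stage: map the words onto a meaning representation.
    is_question = words[0].lower() in {"is", "does", "who", "what"}
    return {"kind": "question" if is_question else "assertion",
            "content": " ".join(words)}

def utilize(meaning):
    # Utilization stage: store assertions, respond to questions.
    if meaning["kind"] == "assertion":
        memory.append(meaning["content"])
        return "stored"
    return "needs an answer: " + meaning["content"]

print(utilize(parse(perceive("The doctor treated the patient."))))
print(utilize(parse(perceive("Does the doctor treat the patient?"))))
```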

In this chapter, we will answer the questions:
• How are individual words combined into the meaning of phrases?
• How is syntactic and semantic information combined in sentence interpretation?
• What inferences do comprehenders make as they hear a sentence?
• How are meanings of individual sentences combined in the processing of larger units of discourse?

•Brain and Language Comprehension

Figure 12.1 highlighted the classic language-processing regions that are active

in the parsing stage that involves single sentences. However, when we consider

the utilization stage and the processing involved in larger portions of

text, we find many other regions of the brain active. Figure 13.1 illustrates

some of the regions identified by Mason and Just (2006) in discourse processing

(for a richer representation of all the areas, see Color Plate 13.1). One can

take the union of Figures 12.1 and 13.1 as something closer to the total brain

network involved in language processing. These figures make clear the fact

that language comprehension involves much of the brain and many cognitive

processes.

Comprehension consists of a perceptual stage, a parsing stage, and a utilization

stage, in that order.

FIGURE 13.1 A representation of some of the brain regions involved in discourse processing, including networks for coherence monitoring, text integration, coarse semantic processing, and spatial imagery. (From Mason & Just, 2006.)


•Parsing

Constituent Structure

Language is structured according to a set of rules that tell us how to go from a

particular string of words to an interpretation of that string’s meaning. For

instance, in English we know that if we hear a sequence of the form A noun

action a noun, the speaker means that an instance of the first noun performed

the action on an instance of the second noun. In contrast, if the sentence is of

the form A noun was action by a noun, the speaker means that an instance of the

second noun performed the action on an instance of the first noun. Thus, our

knowledge of the structure of English allows us to grasp the difference between

A doctor shot a lawyer and A doctor was shot by a lawyer.
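As a toy illustration of such form-to-meaning rules (a sketch only: human comprehension is not pattern matching over word strings, and this fragment knows exactly two patterns), the active and passive forms can be written down together with the interpretation each licenses:

```python
import re

# Two sentence patterns and the interpretation each licenses.
# Active "A noun action a noun": the first noun is the agent.
# Passive "A noun was action by a noun": the second noun is the agent.

def interpret(sentence):
    m = re.match(r"A (\w+) was (\w+) by a (\w+)\.?$", sentence)
    if m:
        patient, action, agent = m.groups()
        return {"agent": agent, "action": action, "object": patient}
    m = re.match(r"A (\w+) (\w+) a (\w+)\.?$", sentence)
    if m:
        agent, action, patient = m.groups()
        return {"agent": agent, "action": action, "object": patient}
    return None

print(interpret("A doctor shot a lawyer"))
# {'agent': 'doctor', 'action': 'shot', 'object': 'lawyer'}
print(interpret("A doctor was shot by a lawyer"))
# {'agent': 'lawyer', 'action': 'shot', 'object': 'doctor'}
```

The point made next is visible here as well: listing whole-sentence patterns cannot scale, which is why comprehension must instead interpret phrases and combine their interpretations.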

In learning to comprehend a language, we acquire a great many rules that

encode the various linguistic patterns in language and relate these patterns to

meaningful interpretations. However, we cannot possibly learn rules for every

possible sentence pattern—sentences can be very long and complex. A very

large (probably infinite) number of patterns would be required to encode all

possible sentence forms. Although we have not learned to interpret all possible

full-sentence patterns, we have learned to interpret subpatterns, or phrases, of

these sentences and to combine, or concatenate, the interpretations of these

subpatterns. These subpatterns correspond to basic phrases, or units, in a sentence’s

structure. These phrase units are also referred to as constituents. From

the late 1950s to the early 1980s, a series of studies were performed that established

the psychological reality of phrase structure (or constituent structure) in

language processing. Chapter 12 reviewed some of the research documenting

the importance of phrase structure in language generation. Here, we review

some of the evidence for the psychological reality of this constituent structure

in comprehension.

We might expect that the more clearly identifiable the constituent structure

of a sentence is, the more easily the sentence can be understood. Graf and

Torrey (1966) presented sentences to participants a line at a time. The passages

could be presented in form A, in which each line corresponded to a major constituent

boundary, or in form B, in which there was no such correspondence.

Examples of the two types of passages follow:

Form A:
During World War II
even fantastic schemes
received consideration
if they gave promise
of shortening the conflict.

Form B:
During World War
II even fantastic
schemes received
consideration if they gave
promise of shortening the conflict.

Participants showed better comprehension of passages in form A. This finding

demonstrates that the identification of constituent structure is important to the

parsing of a sentence.

When people read such passages, they naturally pause at boundaries between

clauses. Aaronson and Scarborough (1977) asked participants to read sentences


displayed word by word on a computer screen. Participants would press a key

each time they wanted to read another word. Figure 13.2 illustrates the pattern of

reading times for a sentence that participants were reading for later recall. Notice

the U-shaped patterns with prolonged pauses at the phrase boundaries. With the completion of each major phrase, participants seemed to need time to process it.

FIGURE 13.2 Word-by-word reading times for the sample sentence “Because of its lasting construction as well as its motor’s power the boat was of high quality.” The short-line markers on the graph indicate breaks between phrase structures. (Adapted from Aaronson & Scarborough, 1977.)

After one has processed the words in a phrase in order to understand it,

there is no need to make further reference to these exact words. Thus, we might

predict that people would have poor memory for the exact wording of a constituent

after it has been parsed and the parsing of another constituent has

begun. The results of an experiment by Jarvella (1971) confirm this prediction.

He read to participants passages with interruptions at various points. At each

interruption, participants were instructed to write down as much of the passage

as they could remember. Of interest were passages that ended with 13-word

sentences such as the following one:

1 2 3 4 5 6

Having failed to disprove the charges,

7 8 9 10 11 12 13

Taylor was later fired by the president.

After hearing the last word, participants were prompted with the first word of

the sentence and asked to recall the remaining words. Each sentence was

composed of a 6-word subordinate clause followed by a 7-word main clause.

Figure 13.3 plots the probability of recall for each of the remaining 12 words

in the sentence (excluding the first, which was used as a prompt). Note the

sharp rise in the function at word 7, the beginning of the main clause. These

data show that participants have best memory for the last major constituent, a

result consistent with the hypothesis that they retain a verbatim representation

of the last constituent only.

An experiment by Caplan (1972) also presents evidence for the use of constituent

structure, but this study used a reaction-time methodology. Participants

FIGURE 13.2 Word-by-word reading times for the sample sentence "Because of its lasting construction as well as its motor's power the boat was of high quality." The short-line markers on the graph indicate breaks between phrase structures. (Adapted from Aaronson & Scarborough, 1977.)

were presented aurally first with a sentence and then with a probe word; they

then had to indicate as quickly as possible whether the probe word was in the

sentence. Caplan contrasted pairs of sentences such as the following pair:

1. Now that artists are working fewer hours oil prints are rare.

2. Now that artists are working in oil prints are rare.

Interest focused on how quickly participants would recognize oil in these two

sentences when probed at the ends of the sentences. The sentences were cleverly

constructed so that, in both sentences, the word oil was fourth from the end

and was followed by the same words. In fact, by splicing tape, Caplan arranged

the presentation so that participants heard the same recording of these last four

words whichever full sentence they heard. However, in sentence 1, oil is part of

the last constituent, oil prints are rare, whereas, in sentence 2, it is part of the

first constituent, now that artists are working in oil. Caplan predicted that participants

would recognize oil more quickly in sentence 1 because they would

still have active in memory a representation of this constituent. As he predicted,

the probe word was recognized more rapidly if it was in the last constituent.

Participants process the meaning of a sentence one phrase at a time and

maintain access to a phrase only while processing its meaning.

Immediacy of Interpretation

An important principle to emerge in more recent studies of language processing

is called the principle of immediacy of interpretation. Basically, this principle

says that people try to extract meaning out of each word as it arrives and

FIGURE 13.3 Probability of recalling a word as a function of its position in the last 13 words in a passage. (Adapted from Jarvella, 1971.)

do not wait until the end of a sentence or even the end of a phrase to decide

how to interpret a word. For instance, Just and Carpenter (1980) studied the

eye movements of participants as they read a sentence. While reading a sentence,

participants will typically fixate on almost every word. Just and Carpenter

found that the time spent fixating on a word is proportional to the amount of

information provided by the word. Thus, if a sentence contains an unfamiliar

or a surprising word, participants pause on that word. They also pause

longer at the end of the phrase containing that word. Figure 13.4 illustrates

the eye fixations of one of their college students reading a scientific passage.

The circles are above the words the student fixated on, and in each circle is the

duration of that fixation. The order of the gazes is left to right except for

the three gazes above engine contains, where the order of gazes is indicated.

Note that unimportant function words such as the and to may be skipped or,

if not skipped, receive relatively little processing. Note the

amount of time spent on the word flywheel. The participant

did not wait until the end of the sentence to think

about this word. Again, look at the amount of time spent

on the highly informative adjective mechanical—the participant

did not wait until the end of the noun phrase to

think about it.

Eye movements have also been used to study the comprehension

of spoken language. In one of these studies

(Allopenna, Magnuson, & Tanenhaus, 1998), participants

were shown computer displays of objects like that in Figure

13.5 and processed instructions such as

Pick up the beaker and put it below the diamond.

Participants would perform this action by selecting the

object with a mouse and moving it, but the experiment

was done to study their eye movements that preceded any

FIGURE 13.4 The time spent by a college reader on the words in the opening two sentences of a technical article about flywheels: "Flywheels are one of the oldest mechanical devices known to man. Every internal-combustion engine contains a small flywheel that converts the jerky motion of the pistons into the smooth flow of energy that powers the drive shaft." The times, indicated above each fixated word, are expressed in milliseconds. This reader read the sentences from left to right, with one regressive fixation to an earlier part. (Adapted from Just & Carpenter, 1980.)

FIGURE 13.5 An example of a computer display used in the study of Allopenna et al. (1998).

mouse action. Figure 13.6 shows the probabilities

that participants fixate on various items in the

display as a function of time since the beginning of

the articulation of “beaker.” It can be seen that participants

are beginning to look to the two items that

start with the same sound (“beaker” and “beetle”)

even before the articulation of the word finishes.

It takes about 400 msec to say the word. Almost

immediately upon offset of the word, their fixations

on the wrong item (“beetle”) decrease and their

fixations on the correct item (“beaker”) shoot up.

Given that it takes about 200 msec to program an

eye movement, this study provides evidence that

participants are processing the meaning of a word

even before it completes.

This immediacy of processing implies that we

will begin to interpret a sentence even before we encounter

the main verb. Sometimes we are aware of

wondering what the verb will be as we hear the sentence. We are likely to experience
something like this in constructions that put the verb last. Consider what
happens as we process the following sentence:
• It was the most expensive car that the CEO bought.

Before we get to bought, we already have some idea of what might be happening

between the CEO and the car. Although this sentence structure with the verb at

the end is unusual for English, it is not unusual for languages such as German.

Listeners of these languages do develop strong expectations about the sentence

before seeing the verb (see Clifton & Duffy, 2001, for a review).

If people process a sentence as each word comes in, why is there so much

evidence for the importance of phrase-structure boundaries? The evidence reflects

the fact that the meaning of a sentence is defined in terms of the phrase

structure, and, even if listeners try to extract all they can from each word, they

will be able to put some things into place only when they reach the end of a

phrase. Thus, people often need extra time at a phrase boundary to complete

this processing. People have to maintain a representation of the current phrase

in memory because their interpretation of it may be wrong, and they may have

to reinterpret the beginning of the phrase. Just and Carpenter (1980) in their

study of reading times found that participants tend to spend extra time at the

end of each phrase in wrapping up the meaning conveyed by that phrase.
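A minimal sketch can make this word-by-word picture concrete. In the toy Python function below, time on a word grows with how informative the word is, and extra wrap-up time is added when the word closes a phrase. All names and numbers are invented for illustration; this is not the Just and Carpenter (1980) model or its parameters.

# A minimal sketch of the reading-time pattern described above.
# All constants and informativeness values are invented.

BASE_MS = 50        # hypothetical time spent on any word (ms)
MS_PER_UNIT = 30    # hypothetical extra time per unit of informativeness
WRAPUP_MS = 150     # hypothetical extra wrap-up time at a phrase boundary

def word_reading_time(informativeness, ends_phrase):
    """Time on a word grows with its informativeness; wrap-up time is
    added when the word closes a phrase."""
    time = BASE_MS + MS_PER_UNIT * informativeness
    if ends_phrase:
        time += WRAPUP_MS
    return time

# A phrase-final, informative word versus a phrase-internal function word:
print(word_reading_time(5, ends_phrase=True))    # 350 ms
print(word_reading_time(1, ends_phrase=False))   # 80 ms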

In processing a sentence, we try to extract as much information as possible from

each word and spend some additional wrap-up time at the end of each phrase.

The Processing of Syntactic Structure

The basic task in parsing a sentence is to combine the meanings of the individual

words to arrive at a meaning for the overall sentence. There are two basic

FIGURE 13.6 Probability of fixating the referent (e.g., "beaker"), the cohort competitor (e.g., "beetle"), and an unrelated item (e.g., "carriage") as a function of time from onset of the critical word "beaker." (From Allopenna et al., 1998.)

sources of syntactic information that can guide us in this task. One source is

word order and the other is inflectional structure. The following two sentences,

although they have identical words, have very different meanings:

1. The dog bit the cat.

2. The cat bit the dog.

The dominant syntactic cue in English is word order. Other languages rely less

on word order and instead use inflections of words to indicate semantic role.

There is a small remnant of such an inflectional system in some English pronouns.

For instance, he and him, I and me, and so on, signal subject versus

object. McDonald (1984) compared English with German, which has a richer

inflectional system. She asked her English participants to interpret sentences

such as

3. Him kicked the girl.

4. The girl kicked he.

The word-order cue in these sentences suggests one interpretation, whereas the

inflection cue suggests an alternative interpretation. English speakers use the

word-order cue, interpreting sentence 3 with him as the subject and the girl

as the object. German speakers, judging comparable sentences in German, do

just the opposite. Bilingual speakers of both German and English tend to interpret

the English sentences more like German sentences; that is, they assign him

in sentence 3 to the object role and girl to the subject role.

An interesting case of combining word order and inflection in English

involves the use of relative clauses. Consider the following sentence:

5. The boy the girl liked was sick.

This sentence is an example of a center-embedded sentence: One clause, the

girl liked (the boy), is embedded in another clause, The boy was sick. As we will

see, there is evidence that people have difficulty with such clauses, perhaps in

part because the beginning of the sentence is ambiguous. For instance, the sentence

could have concluded as follows:

6. The boy the girl and the dog were sick.

To prevent such ambiguity, English offers relative pronouns, which are effectively

like inflections, to indicate the role of the upcoming words:

7. The boy whom the girl liked was sick.

Sentences 5 and 7 are equivalent except that sentence 5 omits whom, the
relative pronoun that signals that the upcoming words are part of an embedded clause.

One might expect that it is easier to process sentences if they have relative

pronouns to signal the embedding of clauses. Hakes and Foss (1970; Hakes,

1972) tested this prediction by using the phoneme-monitoring task. They used

double-embedded sentences such as

8. The zebra which the lion that the gorilla chased killed was running.

9. The zebra the lion the gorilla chased killed was running.


The only difference between sentences 8 and 9 is whether there are relative

pronouns. Participants were required to perform two simultaneous tasks. One

task was to comprehend and paraphrase the sentence. The second task was

to listen for a particular phoneme—in this case a /g/ (in gorilla). Hakes and

Foss predicted that the more difficult a sentence was to comprehend, the more

time participants would take to detect the target phoneme, because they would

have less attention left over from the comprehension task with which to perform

the monitoring. In fact, the prediction was confirmed; participants did take

longer to indicate hearing /g/ when presented with sentences such as sentence 9,

which lacked relative pronouns.

Although the use of relative pronouns facilitates the processing of such sentences,

there is evidence that center-embedded sentences are quite difficult even

with the relative pronouns. In one experiment, Caplan, Alpert, Waters, and

Olivieri (2000) compared center-embedded sentences such as

10. The juice that the child enjoyed stained the rug.

with comparable sentences that are not center-embedded such as

11. The child enjoyed the juice that stained the rug.

They used PET brain-imaging measures to detect processing differences and

found greater activation in Broca’s area with center-embedded sentences. Broca’s

area is usually found to be more active when participants have to deal with more

complex sentence structures (Martin, 2003).

People use the syntactic cues of word order and inflection to help interpret

a sentence.

Semantic Considerations

People use syntactic patterns, such as those illustrated in the preceding subsection,

for understanding sentences, but they can also make use of the meanings

of the words themselves. A person can determine the meaning of a string of

words simply by considering how they can be put together so as to make sense.

Thus, when Tarzan says, Jane fruit eat, we know what he means even though

this sentence does not correspond to the syntax of English. We realize that a

relation is being asserted between someone capable of eating and something

edible.

Considerable evidence suggests that people use such semantic strategies in

language comprehension. Strohner and Nelson (1974) had 2- and 3-year-old

children use animal dolls to act out the following two sentences:

• The cat chased the mouse.
• The mouse chased the cat.

In both cases, the children interpreted the sentence to mean that the cat chased

the mouse, a meaning that corresponded to their prior knowledge about cats

and mice. Thus, these young children were relying more heavily on semantic

patterns than on syntactic patterns.


Fillenbaum (1971, 1974) had adults paraphrase sentences, among which

were “perverse” items such as
• John was buried and died.

More than 60% of the participants paraphrased the sentences in a way that gave

them a more conventional meaning; for example, that John died first and then

was buried. However, the normal syntactic interpretation of such constructions

would be that the first activity occurred before the second, as in
• John had a drink and went to the party.
in contrast with
• John went to the party and had a drink.

So, when a semantic principle is placed in conflict with a syntactic principle, the

semantic principle will sometimes (but not always) determine the interpretation

of the sentence. If you have any doubt about the power of semantics to

dominate syntax, consider the following sentence:

No head injury is too trivial to be ignored.

If you interpreted this sentence to mean that no head injury should be ignored,

you are in the vast majority (Wason & Reich, 1979). However, a careful inspection

of the syntax will indicate that the “correct” meaning is that all head

injuries should be ignored—consider “No missile is too small to be banned”—

which means all missiles should be banned.

Sometimes people rely on the plausible semantic interpretation of words in

a sentence.

The Integration of Syntax and Semantics

Listeners appear to combine both syntactic and semantic information in comprehending

a sentence. Tyler and Marslen-Wilson (1977) asked participants to

try to continue fragments such as

1. If you walk too near the runway, landing planes are

2. If you’ve been trained as a pilot, landing planes are

The phrase landing planes, by itself, is ambiguous. It can mean either “planes

that are landing” or “to land planes.” However, when followed by the plural verb

are, the phrase must have the first meaning. Thus, the syntactic constraints

determine a meaning for the ambiguous phrase. The prior context in fragment 1

is consistent with this meaning, whereas the prior context in fragment 2 is not.

Participants took less time to continue fragment 1, which suggests that they

were using both the semantics of the prior context and the syntax of the current

phrase to disambiguate landing planes. When these factors are in conflict,

the participant’s comprehension is slowed.1


1 The original Tyler and Marslen-Wilson experiment drew methodological criticisms from Townsend and

Bever (1982) and Cowart (1983). For a response, read Marslen-Wilson and Tyler (1987).


Bates, McNew, MacWhinney, Devescovi, and Smith (1982) looked at the

matter of combining syntax and semantics in a different paradigm. They had

participants interpret word strings such as
• Chased the dog the eraser

If you were forced to, what meaning would you assign to this word string? The

syntactic fact that objects follow verbs seems to imply that the dog was being

chased and the eraser did the chasing. The semantics, however, suggest the opposite.

In fact, American speakers prefer to go with the syntax but will sometimes

adopt the semantic interpretation—that is, most say The eraser chased the

dog, but some say The dog chased the eraser. On the other hand, if the word

string is
• Chased the eraser the dog

listeners agree on the interpretation—that is, that the dog chased the eraser.

Another interesting part of the study by Bates et al. compared Americans

with Italians. When syntactic cues were put in conflict with semantic cues,

Italians tended to go with the semantic cues, whereas Americans preferred the

syntactic cues. The most critical case concerned sentences such as
• The eraser bites the dog

or its Italian translation:
• La gomma morde il cane

Americans almost always followed the syntax and interpreted this sentence to

mean that the eraser is doing the biting. In contrast, Italians preferred to use the

semantics and interpret that the dog is doing the biting. Like English, however,

Italian has a subject-verb-object syntax.

Thus, we see that listeners combine both syntactic and semantic cues in

interpreting the sentence. Moreover, the weighting of these two types of cues

can vary from language to language. This evidence and other results indicate

that speakers of Italian weight semantic cues more heavily than do speakers of

English.
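One way to picture this cross-language difference is as a weighted vote between cues, as in the sketch below. The weights, names, and voting scheme are invented for illustration; this is not the Bates et al. procedure itself, only a caricature of cue competition.

# A toy sketch of cue competition. Each cue nominates an agent for the
# sentence, and the more heavily weighted cue wins. Weights are invented.

def choose_agent(syntactic_choice, semantic_choice, w_syntax, w_semantics):
    """Return the agent favored by the weighted combination of cues."""
    votes = {}
    votes[syntactic_choice] = votes.get(syntactic_choice, 0) + w_syntax
    votes[semantic_choice] = votes.get(semantic_choice, 0) + w_semantics
    return max(votes, key=votes.get)

# "The eraser bites the dog": word order nominates the eraser as agent;
# semantics (animacy) nominates the dog.
print(choose_agent("eraser", "dog", w_syntax=0.8, w_semantics=0.2))  # eraser
print(choose_agent("eraser", "dog", w_syntax=0.3, w_semantics=0.7))  # dog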

People integrate semantic and syntactic cues to arrive at an interpretation

of a sentence.

Neural Indicants of Syntactic and Semantic Processing

Researchers have found two indicants of sentence processing in event-related

potentials (ERPs) recorded from the brain. First, there is the N400, which is

an indicant of difficulty in semantic processing. It was originally identified as

a response to semantic anomaly, although it is more general than that. Kutas

and Hillyard (1980a, 1980b) discovered the N400 in their original experiments

when participants heard semantically anomalous sentences such as “He spread

the warm bread with socks.” About 400 ms after the anomalous word (socks),

ERP recordings showed a large negative amplitude shift. Second, there is the

P600, which occurs in response to syntactic violations. For instance, Osterhout

and Holcomb (1992) presented their participants with sentences such as “The


broker persuaded to sell the stock” and found a positive wave at about 600 ms

after the word to, which was the point at which there was a violation of the

syntax. Of particular interest in this context is the relation between the N400

and the P600.

Ainsworth-Darnell, Shulman, and Boland (1998) studied how these two

effects combined when participants heard sentences such as

Control: Jill entrusted the recipe to friends before she suddenly disappeared.

Syntactic anomaly: Jill entrusted the recipe friends before she suddenly

disappeared.

Semantic anomaly: Jill entrusted the recipe to platforms before she suddenly

disappeared.

Double anomaly: Jill entrusted the recipe platforms before she suddenly

disappeared.

The last sentence combines a semantic and a syntactic anomaly. Figure 13.7 contrasts

the ERP waveforms obtained from midline and parietal sites in response to

the various types of sentences. An arrow in the ERPs points to the onset of the

FIGURE 13.7 ERP recordings from (a) central (Cz) and (b) parietal (Pz) sites for the control, syntactic-anomaly, semantic-anomaly, and double-anomaly sentences. The arrows point to the onset of the critical word. (From Ainsworth-Darnell, Shulman, & Boland, 1998.)

critical word (friends or platforms). The two types of sentences containing a

semantic anomaly evoked a negative shift (N400) at the midline site about 400 ms

after the critical word. In contrast, the two types of sentences containing a syntactic

anomaly were associated with a positive shift (P600) in the parietal area about

600 ms after the onset of the critical word. Ainsworth-Darnell et al. used the fact that each

process—syntactic and semantic—affects a different brain region to argue that

the syntactic and semantic processes are separable.
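The pattern of results can be restated as a simple mapping from anomaly type to ERP signature. The function below merely encodes the findings just described; the latencies and site labels are approximate, and the function itself is our illustrative restatement, not part of any study.

# A schematic restatement of the N400/P600 findings described above.

def expected_erp_components(semantic_anomaly, syntactic_anomaly):
    """Map the anomaly type(s) in a sentence to the ERP signature(s)
    reported in the studies discussed in this subsection."""
    components = []
    if semantic_anomaly:
        components.append("N400: negative shift ~400 ms, midline sites")
    if syntactic_anomaly:
        components.append("P600: positive shift ~600 ms, parietal sites")
    return components or ["no anomaly-related component expected"]

print(expected_erp_components(True, False))   # "...warm bread with socks"
print(expected_erp_components(False, True))   # "...persuaded to sell..."
print(expected_erp_components(True, True))    # "...the recipe platforms..."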

ERP recordings indicate that syntactic and semantic violations elicit
different responses in different locations in the brain.

Ambiguity

Many sentences can be interpreted in two or more ways because of either

ambiguous words or ambiguous syntactic constructions. Examples of such

sentences are

• John went to the bank.
• Flying planes can be dangerous.

It is also useful to distinguish between transient ambiguity and permanent ambiguity.

The preceding examples are permanently ambiguous. That is, the ambiguity

remains to the end of the sentence. Transient ambiguity refers to ambiguity

in a sentence that is resolved by the end of the sentence; for example, consider

hearing a sentence that begins as follows:

• The old train . . .

At this point, whether old is a noun or an adjective is ambiguous. If the sentence

continues as follows,

• . . . left the station.

then old is an adjective modifying train. On the other hand, if the sentence continues

as follows,

• . . . the young.

then old is the subject of the sentence and train is a verb. This is an example of

transient ambiguity—an ambiguity in the middle of a sentence for which the

resolution depends on how the sentence ends.

Transient ambiguity is quite prevalent in language, and it leads to a serious

interaction with the principle of immediacy of processing described earlier.

Immediacy of processing implies that we commit to an interpretation of a word

or a phrase right away, but transient ambiguity implies that we cannot always

know the correct interpretation immediately. Consider the following sentence:

• The horse raced past the barn fell.

Most people do a double take on this sentence: they first read one interpretation

and then a second. Such sentences are called garden-path sentences because we

are “led down the garden path” and commit to one interpretation at a certain

point only to discover that it is wrong at another point. For instance, in the

preceding sentence, most readers interpret raced as the main verb of the sentence.


The existence of such garden-path sentences is considered to be one of the

important pieces of evidence for the principle of immediacy of interpretation.

People could postpone interpreting such sentences at points of ambiguity until

the ambiguity is resolved, but they do not.

When one comes upon a point of syntactic ambiguity in a sentence, what

determines its interpretation? A powerful principle is the principle of minimal

attachment. This principle basically says that one interprets a sentence in a way

that causes minimal complication of its phrase structure. Because all sentences

must have a main verb, the simple interpretation would be to include raced in the

main sentence rather than creating a relative clause to modify the noun horse.
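The interplay of immediacy and minimal attachment can be caricatured in a few lines of Python. The toy routine below is not a real parser (the "readings" are just labeled strings invented for illustration): it commits to raced as the main verb as soon as possible and is forced into reanalysis when fell arrives.

# A toy illustration of immediate commitment and garden-path reanalysis.
# Not a real parser; the readings are labeled strings for illustration.

def interpret(words):
    reading = None
    for position, word in enumerate(words, start=1):
        if word == "raced" and reading is None:
            # Minimal attachment: prefer the simpler structure, with
            # "raced" as the main verb, over a reduced relative clause.
            reading = "main verb: [the horse] raced [past the barn]"
        elif word == "fell" and reading and reading.startswith("main verb"):
            # A second verb contradicts the committed reading: reanalyze.
            print(f"word {position} ('fell') forces reanalysis")
            reading = "reduced relative: [the horse raced past the barn] fell"
    return reading

print(interpret("the horse raced past the barn fell".split()))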

Many times we are not aware of the ambiguities that exist in sentences. For

instance, consider the following sentence:

• The woman painted by the artist fell.

As we will see, people seem to have difficulty with this sentence (temporarily

interpreting the woman as the one doing the painting), just like the earlier horse

raced sentence. However, people tend not to be aware of taking a garden path in

the way that they are with the horse raced sentence.

Why are we aware of a reinterpretation in some sentences, such as the horse

raced example, but not in others, such as the woman painted example? If a

syntactic ambiguity is resolved quickly after we encounter it, we seem to be

unaware of ever considering two interpretations. Only if resolution is postponed

substantially beyond the ambiguous phrase are we aware of the need to

reinterpret it (Ferreira & Henderson, 1991). Thus, in the woman painted example,

the ambiguity is resolved immediately after the verb painted, and thus most

people are not aware of the ambiguity. In contrast, in the horse raced example,

the sentence seems to successfully complete as The horse raced past the barn

only to have this interpretation contradicted by the last word fell.

When people come to a point of ambiguity in a sentence, they adopt one

interpretation, which they will have to retract if it is later contradicted.

Neural Indicants of the Processing of Transient Ambiguity

Brain-imaging studies reveal a good deal about how people process ambiguous

sentences. In one study, Mason, Just, Keller, and Carpenter (2003) compared

three kinds of sentences:

Unambiguous: The experienced soldiers spoke about the dangers of the

midnight raid.

Ambiguous preferred: The experienced soldiers warned about the dangers

before the midnight raid.

Ambiguous unpreferred: The experienced soldiers warned about the dangers

conducted the midnight raid.

The verb spoke in the first sentence is unambiguous, but the verb warned in the

last two sentences has a transient ambiguity of just the sort described in the preceding

subsection: Until the end of the sentence, one cannot know whether the


soldiers are doing the warning or are being warned.

As noted, participants prefer the first interpretation.

Mason et al. collected fMRI measures of activation

in Broca’s area as participants read the sentences.

These data are plotted in Figure 13.8 as a function of

time since the onset of the sentences (which lasted

approximately 6–7 s). As is typical of fMRI measures,

the differences among conditions show up only

after the processing of the sentences, corresponding

to the lag in the hemodynamic response. As can be

seen, the unambiguous sentence results in the least

activation, owing to the greater ease in processing

that sentence. However, in comparing the two ambiguous

sentences, we see that activation is greater

for the sentence that ends in the unpreferred way.

fMRI measures such as those in Figure 13.8 can

localize areas in the brain in which processing is taking

place, in this case confirming the critical role of

Broca’s area in the processing of sentence structure. However, these measures

do not identify the fine-grained temporal structure of the processing. An ERP

study by Frisch, Schlesewsky, Saddy, and Alpermann (2002) investigated the

temporal aspect of how people deal with ambiguity. Their study was with German

speakers and took advantage of the fact that some German nouns are ambiguous

in their role assignment. They looked at German sentences that begin

with either of two different nouns and end with a verb. In the following examples,

each German sentence is followed by a word-by-word translation and then

the equivalent English sentence:

1. Die Frau hatte den Mann gesehen.

The woman had the man seen

The woman had seen the man.

2. Die Frau hatte der Mann gesehen.

The woman had the man seen

The man had seen the woman.

3. Den Mann hatte die Frau gesehen.

The man had the woman seen

The woman had seen the man.

4. Der Mann hatte die Frau gesehen.

The man had the woman seen

The man had seen the woman.

Note that, when participants read Die Frau at the beginning of sentences 1 and 2,

they do not know whether the woman is the subject or the object of the sentence.

Only when they read den Mann in sentence 1 can they infer that man is an

object (because of the determiner den) and hence that woman must be the subject.

Similarly, der Mann in sentence 2 indicates that man is the subject and

FIGURE 13.8 The average activation change in Broca's area for the unambiguous, ambiguous preferred, and ambiguous unpreferred sentences as a function of time from the beginning of the sentence. (From Mason et al., 2003.)

therefore woman must be the object. Sentences 3 and 4, because they begin with

Mann and its inflected article, do not have this transient ambiguity. The difference

in when one can interpret these sentences depends on the fact that the masculine

article is inflected for case in German but the feminine article is not.

Frisch et al. used the P600 (already described with respect to Figure 13.7) to

investigate the syntactic processing of these sentences. They found that the

ambiguous first noun in sentences 1 and 2 was followed by a stronger P600

than was the unambiguous first noun in sentences 3 and 4. The contrast between sentences

1 and 2 also is interesting. Although German allows for either subject-object or

object-subject ordering, the subject-object structure in sentence 1 is preferred. For

the unpreferred sentence (2), Frisch et al. found that the second noun was followed

by a greater P600. Thus, when participants reach a transient ambiguity, as in

sentences 1 and 2, they seem to immediately have to work harder to deal with the

ambiguity. They commit to the preferred interpretation and have to do further

work when they learn that it is not the correct interpretation, as in sentence 2.

Activity in Broca’s area increases when participants encounter a transient

ambiguity and when they have to change an initial interpretation of a

sentence.

Lexical Ambiguity

The preceding discussion was concerned with how participants deal with syntactic

ambiguity. In lexical ambiguity, where a single word has two meanings,

there is often no structural difference in the two interpretations of a sentence.

A series of experiments beginning with Swinney (1979) helped to reveal how

people determine the meaning of ambiguous words. Swinney asked participants

to listen to sentences such as
• The man was not surprised when he found several spiders, roaches, and

other bugs in the corner of the room.

Swinney was concerned with the ambiguous word bugs (meaning either insects or

electronic listening devices). Just after hearing the word, participants would be

presented with a string of letters on the screen, and their task was to judge whether

that string made a correct word. Thus, if they saw ant, they would say yes; but

if they saw ont, they would say no. This is the lexical-decision task described

in Chapter 6 in relation to the mechanisms of spreading activation. Swinney was

interested in how the word bugs in the passage would prime the lexical judgment.

The critical contrasts involved the relative times to judge spy, ant, or sew,

following bugs. The word ant is related to the primed meaning of bugs, whereas

spy is related to the unprimed meaning. The word sew defines a neutral control

condition. Swinney found that recognition of either spy or ant was facilitated if

that word was presented within 400 ms of the prime, bugs. Thus, the presentation

of bugs immediately activates both of its meanings and their associations.

If the probe was delayed by more than 700 ms, however, only the contextually appropriate word ant was

facilitated. It appears that a correct meaning is selected in this time and the

other meaning becomes deactivated. Thus, two meanings of an ambiguous


word are momentarily active, but context operates very rapidly to select the appropriate

meaning.
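This time course can be caricatured as follows. The thresholds (about 400 ms and 700 ms) come from the results just described, but the function itself is an invented simplification, not Swinney's model.

# A schematic sketch of the priming time course Swinney (1979) observed.
# The two thresholds are from the text; everything else is invented.

def active_meanings(ms_after_bugs, contextual_meaning="insect"):
    """Which meanings of the ambiguous word 'bugs' would prime a
    lexical decision at a given delay after the word is heard?"""
    if ms_after_bugs <= 400:
        # Both meanings (and associates such as ant and spy) are active.
        return {"insect", "listening device"}
    if ms_after_bugs > 700:
        # Context has selected one meaning; the other is deactivated.
        return {contextual_meaning}
    return {contextual_meaning}  # selection happens somewhere in between

print(active_meanings(300))  # both ant and spy facilitated
print(active_meanings(800))  # only ant facilitated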

When an ambiguous word is presented, participants select a particular

meaning within 700 ms.

Modularity Compared with Interactive Processing

There are two bases by which people can disambiguate ambiguous sentences.

One possibility is the use of semantics, which is the basis for disambiguating

the word bugs in the sentence given in the preceding subsection. The other possibility

is the use of syntax. Advocates of the language-modularity position (see

Chapter 12) have argued that there is an initial phase in which we merely

process syntax, and only later do we bring semantic factors to bear. Thus, initially

only syntax is available for disambiguation, because syntax is part of a

language-specific module that can operate quickly by itself. In contrast, to bring

semantics to bear requires using all of one’s world knowledge, which goes far

beyond anything that is language specific. Opposing the modularity position is

that of interactive processing, the proponents of which argue that syntax and

semantics are combined at all levels of processing.

Much of the debate between these two positions has concerned the processing

of transient syntactic ambiguity. In the initial study of what has become a

long series of studies, Ferreira and Clifton (1986) asked participants to read

sentences such as

1. The woman painted by the artist was very attractive to look at.

2. The woman that was painted by the artist was very attractive to look at.

3. The sign painted by the artist was very attractive to look at.

4. The sign that was painted by the artist was very attractive to look at.

Sentences 1 and 3 are called reduced relatives because the relative pronoun that is

missing. There is no local syntactic basis for deciding whether the noun-verb

combination is a relative clause construction or an agent-action combination.

Ferreira and Clifton argued that, because of the principle of minimal attachment,

people have a natural tendency to encode noun-verb combinations such as The

woman painted as agent-action combinations. Evidence for this tendency is that

participants take longer to read by the artist in the first sentence than in the

second. The reason is that they discover that their agent-action interpretation is

wrong in the first sentence and have to recover, whereas the syntactic cue that was

in the second sentence prevents them from ever making this misinterpretation.

The real interest in the Ferreira and Clifton experiments is in sentences 3

and 4. Semantic factors should rule out the agent-action interpretation of

sentence 3, because a sign cannot be an animate agent and engage in painting.

Nonetheless, participants took just as long to read by the artist in sentence 3 as

in sentence 1 and longer than in unambiguous sentences 2 or 4. Thus, argued

Ferreira and Clifton, participants first use only syntactic factors and so misinterpret

the phrase The sign painted and then use the syntactic cues in the phrase


by the artist to correct that misinterpretation. Thus, although semantic factors

could have done the job and prevented the misinterpretation, participants

seemingly do all their initial processing by using syntactic cues.

Experiments of this sort have been used to argue for the modularity of language.

The argument is that our initial processing of language makes use of

something specific to language—namely, syntax—and ignores other general,

nonlinguistic knowledge that we have of the world, for example, that signs cannot

paint. However, Trueswell, Tanenhaus, and Garnsey (1994) argued that

many of the sentences in the Ferreira and Clifton study were not like sentence 3.

Specifically, although the sentences were supposed to have a semantic basis for

disambiguation, many did not. For instance, among the Ferreira and Clifton

sentences were sentences such as

5. The car towed from the parking lot was parked illegally.

Here car towed was supposed to be unambiguous, but it is possible for car to be

the subject of towed as in

6. The car towed the smaller car from the parking lot.

When Trueswell et al. used sentences that avoided these problems, they found

that participants did not have any difficulty with the sentences. For instance,

participants showed no more difficulty with

7. The evidence examined by the lawyer turned out to be unreliable.

than with

8. The evidence that was examined by the lawyer turned out to be unreliable.

Thus, people do seem to be able to select the correct interpretation when it is not

semantically possible to interpret the noun (evidence) as an agent of the verb.

Thus, the initial syntactic decisions are not made without reference to semantic

factors.

Additionally, McRae, Spivey-Knowlton, and Tanenhaus (1998) showed that

the relative plausibility of the noun as agent of the verb affects the difficulty of

the construction. They compared the following pairs of sentences:

9. The cop arrested by the detective was guilty of taking bribes.

10. The cop that was arrested by the detective was guilty of taking bribes.

and

11. The crook arrested by the detective was guilty of taking bribes.

12. The crook that was arrested by the detective was guilty of taking bribes.

They found that participants suffered much greater difficulty with the reduced

relatives in sentence 9, where the subject cop is plausible as the agent for arresting,

than in sentence 11, where the subject crook is not.

Participants appear to be able to use semantic information immediately

to guide syntactic decisions.


Implications

Intelligent chatterboxes

Being able to communicate successfully in language is very much tied to our conception of human intelligence. For instance, Alan Turing (a famous British logician who broke the secret communication code of the Nazis during World War II) proposed in 1950 (Turing, 1950) that we decide whether a machine is intelligent by whether it can engage in a conversation that convinces the listener that it is a human. In what has come to be known as the Turing Test, a judge would interact with a human and a computer over a chat system (to eliminate visual cues). If, after conversing with both, the judge could not determine which was human and which was computer, the computer would be declared to be intelligent. Turing predicted that by the year 2000 a computer would be able to pass this test.

In 1990, the Loebner Prize was created for the first computer that could pass the Turing test. Each year a contest is held in which various computer entries are judged. A bronze prize is awarded yearly to the program that gives the most convincing conversation, but so far no machine has been able to fool a majority of the judges, which would result in the silver prize (the gold prize is reserved for something that even looks like a human). The winner in 2008, a program called Elbot, came close to winning the silver prize, fooling 3 of the 12 judges. It even deceived reporter Will Pavia of The Times (http://technology.timesonline.co.uk/tol/news/tech_and_web/article4934858.ece). Below are two small snippets of conversation, one between an interrogator and a human and one between an interrogator and Elbot. I have not identified which is the human and which is the computer. Try to judge which snippet involves the human and which involves Elbot.

Snippet A

Interrogator: What do you think of this year’s Turner Prize shortlist?
Respondent: Difficult question. I will have to work on that and get back to you tomorrow.
Interrogator: You sound rather like Sarah Palin.
Respondent: Did this Palin ever write a telephone book?
Interrogator: I doubt it. Does anyone really write telephone books? I don’t suppose you caught the football last night?
Respondent: What a shame. There are many talented authors who waste their energies on prose or run-of-the-mill poetry. But to write a telephone book is highest art!

Snippet B

Interrogator: What do you think of this year’s Turner Prize shortlist?
Respondent: good I think. Better than the years before i herad
Interrogator: Which was your favourite?
Respondent: It’s my first time, so I can’t compare
Interrogator: No, I mean of the entries this time around.
Respondent: Not really sure

•Utilization

After a sentence has been parsed and mapped into a representation of its meaning,

what then? A listener seldom passively records the meaning. If the sentence

is a question or an imperative, for example, the speaker will expect the listener

to take some action in response. Even for declarative sentences, moreover, there

is usually more to be done than simply registering the sentence. Fully understanding

a sentence requires making inferences and connections. In Chapter 6,

we considered the way in which such elaborative processing leads to better

memory. Here, we will review some of the research on how people make such

inferences.

Bridging versus Elaborative Inferences

In understanding a sentence, the comprehender must make inferences that go

beyond what is stated. Researchers typically distinguish between bridging inferences

(also called backward inferences) and elaborative inferences (also called

forward inferences). Bridging inferences reach back in the text to make connections

with earlier parts of the text. Elaborative inferences, in contrast, add new

information to the interpretation of the text and often predict what will be

coming up in the text. To illustrate the difference between bridging and elaborative

inferences, contrast the following pairs of sentences used by Singer (1994):

1. Direct statement: The dentist pulled the tooth painlessly. The patient

liked the method.

2. Bridging inference: The tooth was pulled painlessly. The dentist used a

new method.

3. Elaborative inference: The tooth was pulled painlessly. The patient liked

the new method.

Having been presented with these sentence pairs, participants were asked

whether it was true that A dentist pulled the tooth. This is explicitly stated in example

1, but it is also highly probable in examples 2 and 3, even though it is not

stated. The inference that the dentist pulled the tooth in example 2 is required

to connect dentist in the second sentence to the first and so would be classified

as a backward bridging inference. The inference in example 3 is an elaboration

(because a dentist is not mentioned in either sentence) and so would be classified

as a forward elaborative inference. Participants were equally fast to verify A

dentist pulled the tooth in the bridging inference condition of example 2 as they

were in the direct condition of example 1, indicating that they made the bridging

inference. However, they were about a quarter of a second slower to verify

the sentence in the elaborative-inference condition of example 3, indicating

that they had not made the elaborative inference.

The problem with elaborative inferences is that there are no bounds on how

many such inferences can be made. Consider the sentence The tooth was pulled

painlessly. In addition to inferring who pulled the tooth, one could make


inferences about what instrument was used to make the extraction, why the

tooth was pulled, why the procedure was painless, how the patient felt, what

happened to the patient afterward, which tooth (e.g., incisor or molar) was

pulled), how easy the extraction was, and so on. Considerable research has been

undertaken in trying to determine exactly which elaborative inferences are

made (Graesser, Singer, & Trabasso, 1994). In the Singer (1994) study just described,

the elaborative inference seems not to have been made. As an example

of a study in which an elaborative inference seems to have been made, consider

the experiment reported by Long, Golding, and Graesser (1992). They had participants

read a story that included the following critical sentence:
• A dragon kidnapped the three daughters.

After reading this sentence, participants made a lexical decision about the

word eat (a lexical decision task, discussed earlier in this chapter and in

Chapter 6, involves deciding whether a string of letters makes a word). Long et

al. found that participants could make the lexical decision more rapidly after

reading this sentence than in a neutral context. From these data, they argued

that participants made the inference that the dragon’s goal was to eat the

daughters (which had not been directly stated or even suggested in the story).

Long et al. argued that, when reading a story, we normally make inferences

about a character’s goals.

Although bridging inferences are made automatically, it is optional whether

people will make elaborative inferences. It takes effort to make these inferences

and readers need to be sufficiently engaged in the text they are reading to make

them. It also appears to depend on reading ability. For instance, in one study

Murray and Burke (2003) had participants read passages like

Carol was fed up with her job waiting on tables. Customers were rude, the

chef was impossibly demanding, and the manager had made a pass at her just

that day. The last straw came when a rude man at one of her tables complained

that the spaghetti she had just served was cold. As he became louder

and nastier, she felt herself losing control.

The passage then ended with one of the following two sentences:

Experimental: Without thinking of the consequences, she picked up the plate

of spaghetti and raised it above the customer’s head.

Or

Control: To verify the complaint, she picked up the plate of spaghetti and

raised it above the customer’s head.

After reading this sentence, participants were presented with a critical word like

“dump” which is related to an elaborative inference that readers would only

make in the experimental condition. They simply had to read the word. Participants

classified as having high reading ability read the word “dump” faster in

the experimental condition, indicating they had made the inference. However,

low-reading-ability participants did not. Thus, it would appear that high-ability

readers had made the elaborative inference that Carol was going to dump the

spaghetti on the customer’s head, whereas the low-ability readers had not.


In understanding a sentence, listeners make bridging inferences to connect

it to prior sentences but only sometimes make elaborative inferences that

connect to possible future material.

Inference of Reference

An important aspect of making a bridging inference consists of recognizing

when an expression in the sentence refers to something that we should already

know. Various linguistic cues indicate that an expression is referring to something

that we already know. One cue in English turns on the difference between

the definite article the and the indefinite article a. The tends to be used to signal

that the comprehender should know the reference of the noun phrase,

whereas a tends to be used to introduce a new object. Compare the difference

in meaning of the following sentences:

1. Last night I saw the moon.

2. Last night I saw a moon.

Sentence 1 indicates a rather uneventful fact—seeing the same old moon as

always—but sentence 2 carries the clear implication of having seen a new moon.

There is considerable evidence that language comprehenders are quite sensitive

to the meaning communicated by this small difference in the sentences. In one

experiment, Haviland and Clark (1974) compared participants’ comprehension

time for two-sentence pairs such as

3. Ed was given an alligator for his birthday. The alligator was his favorite

present.

4. Ed wanted an alligator for his birthday. The alligator was his favorite

present.

Both pairs have the same second sentence. Pair 3 introduces in its first sentence

a specific antecedent for the alligator. On the other hand, although alligator is

mentioned in the first sentence of pair 4, a specific alligator is not introduced.

Thus, there is no antecedent in the first sentence of pair 4 for the alligator. The

definite article the in the second sentence of both pairs supposes a specific

antecedent. Therefore, we would expect that participants would have difficulty

with the second sentence in pair 4 but not in pair 3. In the Haviland and Clark

experiment, participants saw pairs of such sentences one at a time. After they

comprehended each sentence, they pressed a button. The time was measured

from the presentation of the second sentence until participants pressed a button

indicating that they understood that sentence. Participants took an average

of 1031 ms to comprehend the second sentence in pairs, such as pair 3, in

which an antecedent was given, but they took an average of 1168 ms to comprehend

the second sentence in pairs, such as pair 4, in which there was no

antecedent for the definite noun phrase. Thus, comprehension took more than

a tenth of a second longer when there was no antecedent.

The results of an experiment done by Loftus and Zanni (1975) showed that

choice of articles could affect listeners’ beliefs. These experimenters showed


participants a film of an automobile accident and asked them a series of questions.

Some participants were asked,

5. Did you see a broken headlight?

Other participants were asked,

6. Did you see the broken headlight?

In fact, there was no broken headlight in the film, but question 6 uses a definite

article, which supposes the existence of a broken headlight. Participants were

more likely to answer “Yes” when asked the question in form 6. As Loftus and

Zanni noted, this finding has important implications for the interrogation of

eyewitnesses.

Comprehenders take the definite article the to imply the existence of a

reference for the noun.

Pronominal Reference

Another aspect of processing reference concerns the interpretation of pronouns.

When one hears a pronoun such as she, deciding who is being referenced

is critical. A number of people may have already been mentioned, and all

are candidates for the reference of the pronoun. As Just and Carpenter (1987)

noted, there are a number of bases for resolving the reference of pronouns (a toy sketch combining these cues follows the list):

1. One of the most straightforward is to use number or gender cues. Consider
• Melvin, Susan, and their children left when (he, she, they) became sleepy.

Each possible pronoun has a different referent.

2. A syntactic cue to pronominal reference is that pronouns tend to refer to

objects in the same grammatical role (e.g., subject versus object). Consider
• Floyd punched Bert and then he kicked him.

Most people would agree that the subject he refers to Floyd and the object him

refers to Bert.

3. There is also a strong recency effect such that the most recent candidate

referent is preferred. Consider
• Dorothea ate the pie; Ethel ate cake; later she had coffee.

Most people would agree that she probably refers to Ethel.

4. Finally, people can use their knowledge of the world to determine

reference. Compare
• Tom shouted at Bill because he spilled the coffee.
• Tom shouted at Bill because he had a headache.

Most people would agree that he in the first sentence refers to Bill because you

tend to scold people who make mistakes, whereas he in the second sentence

refers to Tom because people tend to be cranky when they have headaches.
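Here is the toy sketch promised above, combining the four cues in the list. The scoring scheme, weights, and data structures are all invented for illustration; actual pronoun resolution is far more subtle than this.

# A toy pronoun resolver scoring candidates with the four cues above:
# (1) gender/number match, (2) grammatical-role parallelism, (3) recency,
# and (4) world knowledge. All weights are invented.

def resolve_pronoun(pronoun, candidates, knowledge_favors=None):
    """candidates are ordered by mention; the most recent comes last."""
    best_name, best_score = None, float("-inf")
    for recency, cand in enumerate(candidates):
        if cand["gender"] != pronoun["gender"]:
            continue                      # cue 1: gender/number must match
        score = 0.0
        if cand["role"] == pronoun["role"]:
            score += 1.0                  # cue 2: parallel grammatical role
        score += 0.5 * recency            # cue 3: prefer recent mentions
        if cand["name"] == knowledge_favors:
            score += 2.0                  # cue 4: world knowledge dominates
        if score > best_score:
            best_name, best_score = cand["name"], score
    return best_name

# "Dorothea ate the pie; Ethel ate cake; later she had coffee."
people = [{"name": "Dorothea", "gender": "f", "role": "subject"},
          {"name": "Ethel", "gender": "f", "role": "subject"}]
she = {"gender": "f", "role": "subject"}
print(resolve_pronoun(she, people))  # Ethel, by recency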

In keeping with the immediacy-of-interpretation principle articulated earlier,

people try to determine who a pronoun refers to immediately upon encountering


it. For instance, in studies of eye fixations (Carpenter & Just, 1977; Ehrlich &

Rayner, 1983; Just & Carpenter, 1987), researchers found that people fixated on a

pronoun longer when it is harder to determine its reference. Ehrlich and Rayner

(1983) also found that participants’ resolution of the reference tends to spill over

into the next fixation, suggesting they are still processing the pronoun while reading

the next word.

Corbett and Chang (1983) found evidence that participants consider multiple

candidates for a referent. They had participants read sentences such as
• Scott stole the basketball from Warren and he sank a jumpshot.

After reading the sentence, participants saw a probe word and had to decide

whether the word appeared in the sentence. Corbett and Chang found that time

to recognize either Scott or Warren decreased after reading such a sentence.

They also asked participants to read the following control sentence, which did

not require the referent of a pronoun to be determined:
• Scott stole the basketball from Warren and Scott sank a jumpshot.

In this case, only recognition of Scott was facilitated. Warren was facilitated

only in the first sentence because, in that sentence, participants had to consider

it a possible referent of he before settling on Scott as the referent.

The results of both the Corbett and Chang study and the Ehrlich and Rayner

study indicate that resolution of pronoun reference lasts beyond the reading of

the pronoun itself. This finding indicates that processing is not always as immediate

as the immediacy-of-processing principle might seem to imply. The processing

of pronominal reference spills over into later fixations (Ehrlich & Rayner,

1983), and there is still priming for the unselected reference at the end of the

sentence (Corbett & Chang, 1983).

Comprehenders consider multiple possible candidates for the referent of a

pronoun and use syntactic and semantic cues to select a referent.

Negatives

Negative sentences appear to suppose a positive sentence and then ask us to infer

what must be true if the positive sentence is false. For instance, the sentence

John is not a crook supposes that it is reasonable to assume John is a crook but

asserts that this assumption is false. As another example, imagine the following

four replies from a normally healthy friend to the question How are you feeling?

1. I am well.

2. I am sick.

3. I am not well.

4. I am not sick.

Replies 1 through 3 would not be regarded as unusual linguistically, but reply 4

does seem peculiar. By using the negative, reply 4 is supposing that thinking of

our friend as sick is reasonable. Why would we think our friend is sick, and

what is our friend really telling us by saying it is not so? In contrast, the negative

in reply 3 is easy to understand, because supposing that the friend is normally

well is reasonable and our friend is telling us that this is not so.


Clark and Chase (Chase & Clark, 1972; H. H. Clark, 1974; Clark & Chase,

1972) conducted a series of experiments on the verification of negatives (see

also Carpenter & Just, 1975; Trabasso, Rollins, & Shaughnessy, 1971). In a typical

experiment, they presented participants with a card like that shown in

Figure 13.9 and asked them to verify one of four sentences about this card:

1. The star is above the plus—true affirmative.

2. The plus is above the star—false affirmative.

3. The plus is not above the star—true negative.

4. The star is not above the plus—false negative.

The terms true and false refer to whether the sentence is true of the picture; the

terms affirmative and negative refer to whether the sentence structure has a

negative element. Sentences 1 and 2 are simple assertions, but sentences 3 and

4 contain a supposition plus a negation of the supposition. Sentence 3 supposes

that the plus is above the star and asserts that this supposition is false; sentence 4

supposes that the star is above the plus and asserts that this supposition is false.

Clark and Chase assumed that participants would check the supposition first

and then process the negation. In sentence 3, the supposition does not match the

picture, but in sentence 4, the supposition does match the picture. Assuming

that mismatches would take longer to process, Clark and Chase predicted that

participants would take longer to respond to sentence 3, a true negative, than

to sentence 4, a false negative. In contrast, participants should take longer to

process sentence 2, the false affirmative, than sentence 1, the true affirmative,

because sentence 2 does not match the picture. In fact, the difference between

sentences 2 and 1 should be identical with the difference between sentences 3

and 4, because both differences correspond to the extra time due to a mismatch

between the sentence and the picture.

Clark and Chase developed a simple and elegant mathematical model for such data. They assumed that processing sentences 3 and 4 took N time units longer than did processing sentences 1 and 2 because of the more complex supposition-plus-negation structure of sentences 3 and 4. They also assumed that processing sentence 2 took M time units longer than did processing sentence 1 because of the mismatch between picture and assertion. Similarly, they assumed that processing sentence 3 took M time units longer than did processing sentence 4 because of the mismatch between picture and supposition. Finally, they assumed that processing a true affirmative such as sentence 1 took T time units. The time T refers to the time used in processes exclusive of negation or the picture mismatch. Let us consider the total time that participants should spend processing a sentence such as sentence 3: This sentence has a complex supposition-and-negation structure, which costs N time units, and a supposition mismatch, which costs M time units. Therefore, total processing time should be T + M + N. Table 13.1 shows both the observed data and the reaction-time predictions that can be derived for the Clark and Chase experiment. The best predicting values for T, M, and N for this experiment can be estimated from the data as T = 1,469 ms, M = 246 ms, and N = 320 ms. As you can confirm, the predictions match the observed times remarkably well. In particular, the difference between true negatives and false negatives is close to the difference between false affirmatives and true affirmatives.

TABLE 13.1 Observed and Predicted Reaction Times in the Sentence-Verification Experiment

Condition           Observed Time   Equation    Predicted Time
True affirmative    1463 ms         T           1469 ms
False affirmative   1722 ms         T + M       1715 ms
True negative       2028 ms         T + M + N   2035 ms
False negative      1796 ms         T + N       1789 ms
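To make the additive structure of the model concrete, here is a minimal sketch in Python. The parameter values are the estimates reported above; the variable and dictionary names are ours, not Clark and Chase's.

```python
# A minimal sketch of Clark and Chase's additive reaction-time model.
# Each condition's predicted time is a sum of a subset of the parameters:
# T (base time), M (mismatch cost), and N (negation cost).

observed = {                      # observed verification times (ms)
    "true affirmative":  1463,
    "false affirmative": 1722,
    "true negative":     2028,
    "false negative":    1796,
}

# Which costs apply to each condition: multipliers for (T, M, N).
components = {
    "true affirmative":  (1, 0, 0),   # T
    "false affirmative": (1, 1, 0),   # T + M
    "true negative":     (1, 1, 1),   # T + M + N
    "false negative":    (1, 0, 1),   # T + N
}

def predict(T, M, N):
    """Predicted reaction time (ms) for each condition."""
    return {cond: t * T + m * M + n * N
            for cond, (t, m, n) in components.items()}

for cond, pred in predict(T=1469, M=246, N=320).items():
    print(f"{cond:17s} observed {observed[cond]} ms, predicted {pred} ms")
```

Because the model is linear in its three parameters, T, M, and N could equally be estimated from the four observed times by least squares.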

FIGURE 13.9 A card like that presented to participants in Clark and Chase’s sentence-verification experiments. Participants were to say whether simple affirmative and negative sentences correctly described these patterns.

This finding supports the hypothesis that participants do extract the suppositions

of negative sentences and match them to the picture.

Comprehenders process a negative by first processing its embedded supposition

and then the negation.

•Text Processing

So far, we have focused on the comprehension of single sentences in isolation.

Sentences are more frequently processed in larger contexts; for example, in the

reading of a textbook. Texts, like sentences, are structured according to certain

patterns, although these patterns are perhaps more flexible than those for sentences.

Researchers have noted that a number of recurring relations serve to

organize sentences into larger parts of a text. Some of the relations that have

been identified are listed in Table 13.2. These structural relations specify the way

in which a sentence should be related to the overall text. For instance, the first

text structure (response) in Table 13.2 directs the reader to relate one set of

sentences as part of the solution to problems posed by other sentences. These

relations can be at any level of a text. That is, the main relation organizing a paragraph might be any of the eight in Table 13.2. Subpoints in a paragraph also may be organized according to any of these relations.

TABLE 13.2 Possible Types of Relations among Sentences in a Text

Type of Relation    Description
1. Response         A question is presented and an answer follows, or a problem is presented and a solution follows.
2. Specific         Specific information is given subsequent to a more general point.
3. Explanation      An explanation is given for a point.
4. Evidence         Evidence is given to support a point.
5. Sequence         Points are presented in their temporal sequence as a set.
6. Cause            An event is presented as the cause of another event.
7. Goal             An event is presented as the goal of another event.
8. Collection       A loose structure of points is presented. (This is perhaps a case in which there is no real organizing relation.)

To see how the relations in Table 13.2 might be used, consider Meyer’s

(1974) now-classic analysis of the following paragraph:

Parakeet Paragraph

The wide variety in color of parakeets that are available on the market today resulted

from careful breeding of the color mutant offspring of green-bodied and

yellow-faced parakeets. The light green body and yellow face color combination

is the color of the parakeets in their natural habitat, Australia. The first living

parakeets were brought to Europe from Australia by John Gould, a naturalist, in

1840. The first color mutation appeared in 1872 in Belgium; these birds were

completely yellow. The most popular color of parakeets in the United States is

sky-blue. These birds have sky-blue bodies and white faces; this color mutation

occurred in 1878 in Europe. There are over 66 different colors of parakeets listed

by the Color and Technical Committee of the Budgerigar Society. In addition to

the original green-bodied and yellow-faced birds, colors of parakeets include

varying shades of violets, blues, grays, greens, yellows, and whites. (p. 61)

Her analysis of this paragraph is approximately reproduced in Table 13.3. Note

that this analysis tends to organize various facts into more or less major points.

TABLE 13.3 Analysis of the Parakeet Paragraph

1. A explains B.
   A. There was careful breeding of color mutants of green-bodied and yellow-faced parakeets. The historical sequence is
      1. Their natural habitat was Australia. Specific detail:
         a. Their color here is a light-green body and yellow-face combination.
      2. The first living parakeets were brought to Europe from Australia by John Gould in 1840. Specific detail:
         a. John Gould was a naturalist.
      3. The first color mutation appeared in 1872 in Belgium. Specific detail:
         a. These birds were completely yellow.
      4. The sky-blue mutation occurred in 1878 in Europe. Specific details:
         a. These birds have sky-blue bodies and white faces.
         b. This is the most popular color in America.
   B. There is a wide variety in color of parakeets that are on the market today. Evidence for this is
      1. There are over 66 different colors of parakeets listed by the Color and Technical Committee of the Budgerigar Society.
      2. There are many available colors. A collection of these is
         a. The original green-bodied and yellow-faced birds
         b. Violets
         c. Blues
         d. Grays
         e. Greens
         f. Yellows
         g. Whites

From Meyer (1974).

The highest-level organizing relation in this paragraph is explanation (see

item 3, Table 13.2). Specifically, the major points in this explanation are that

(point A) there has been careful breeding of color mutants and (point B) there

is a wide variety of parakeet color, and point A is given as an explanation of

point B. Organized under point A are some events from the history of parakeet

breeding. This organization is an example of a sequence relation. Organized

under these events are specific details. So, for instance, organized under A2 is

the fact that John Gould was a naturalist. Organized under point B is evidence

supporting the assertion about the wide color variety and some details about

the available variation in color.
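One way to see what such an analysis amounts to is to write the hierarchy down as an explicit data structure. The sketch below is an illustrative rendering of Table 13.3 in Python; the field names ("relation", "point", "details", and so on) are invented for this sketch, while the relation labels come from Table 13.2.

```python
# An illustrative rendering of Meyer's analysis (Table 13.3) as a nested
# data structure: each node is linked to its subpoints by one of the
# relations in Table 13.2.
parakeet_analysis = {
    "relation": "explanation",  # point A explains point B
    "A": {
        "point": "careful breeding of color mutants of the original parakeets",
        "relation": "sequence",  # historical events in temporal order
        "events": [
            {"point": "natural habitat was Australia",
             "details": ["light-green body and yellow-face color"]},
            {"point": "first living parakeets brought to Europe by John Gould, 1840",
             "details": ["Gould was a naturalist"]},
            {"point": "first color mutation, Belgium, 1872",
             "details": ["these birds were completely yellow"]},
            {"point": "sky-blue mutation, Europe, 1878",
             "details": ["sky-blue bodies, white faces",
                         "most popular color in the United States"]},
        ],
    },
    "B": {
        "point": "wide variety in color of parakeets on the market today",
        "relation": "evidence",
        "support": ["over 66 colors listed by the Budgerigar Society",
                    "violets, blues, grays, greens, yellows, and whites"],
    },
}

# The top-level relation organizing the whole paragraph:
print(parakeet_analysis["relation"])  # explanation
```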

The propositions in a text can be organized hierarchically according to

various semantic relations.

Text Structure and Memory

A great deal of research has demonstrated the psychological significance of text

structure. A number of hypotheses differ in regard to exactly what system of

relations should be used in the analysis of texts, but they generally agree that

some sort of hierarchical structure organizes the propositions of a text. Memory

experiments have yielded evidence that participants do, to some degree, respond

to that hierarchical structure.

Meyer, Brandt, and Bluth (1978) studied students’ perception of the high-level

structure of a text—that is, the structural relations at the higher levels

of hierarchies like that in Table 13.3. They found considerable variation in

participants’ ability to recognize the high-level structure that organized a text.

Moreover, they found that participants’ ability to identify the top-level structure

of a text was an important predictor of their memory for the text. In

another study, on ninth-graders, Bartlett (1978) found that only 11% of participants

consciously identified and used high-level structure to remember text

material. This select group did twice as well as other students on their recall

scores. Bartlett also showed that training students to identify and use top-level

structure more than doubled recall performance.

In addition to its hierarchical structure, a text tends to be held together

by causal and logical structures. This tendency is clearest in narratives in

which one event in a sequence of events causes the next event. The scripts

discussed in Chapter 5 are one kind of knowledge structure that is designed

to encode such causal relations. Often the causal links are not explicitly

stated but rather have to be inferred. For instance, we might hear on a

newscast

• There is an accident on the Parkway East. Traffic is being rerouted

through Wilkinsburg.

It is left to the listener to infer that the first fact is the cause of the second fact.

Keenan, Baillet, and Brown (1984) studied the effect of the probability of the

causal relation connecting two sentences on the processing of the second sentence.


They asked participants to read pairs of sentences, of which the first might be one

of the following sentences:

1a. Joey’s big brother punched him again and again.

1b. Racing down the hill, Joey fell off his bike.

1c. Joey’s crazy mother became furiously angry with him.

1d. Joey went to a neighbor’s house to play.

Keenan et al. were interested in the effect of the first sentence on time to read a

second sentence such as

2. The next day, his body was covered with bruises.

Sentences 1a through 1d are ordered in decreasing probability of a causal connection

to the second sentence. Correspondingly, Keenan et al. found that participants’

reading times for sentence 2 increased from 2.6 s when preceded by highly probable causes such as that given in sentence 1a to 3.3 s when preceded by less probable causes such as that given in sentence 1d. Thus, it takes longer

to understand a more distant causal relation.

There also are effects of causal relatedness on recall. Those parts of a story

that are more central to its causal structure are more likely to be recalled

(Black & Bern, 1981; Trabasso, Secco, & van den Broek, 1984). For instance,

Black and Bern had participants study stories that included pairs of sentences

such as

• The cat leapt up on the kitchen table.
• Fred picked up the cat and put it outside.

which are causally related. They contrasted this pair with pairs of sentences such as

• The cat rubbed against the kitchen table.
• Fred picked up the cat and put it outside.

which are less plausibly connected by a causal relation. Although the second

sentence is identical in both cases, participants displayed better memories for

the first sentence of a causally related pair.

Thorndyke (1977) also showed that memory for a story is poorer if the

organization of the text conflicts with what would be considered its “natural”

structure. Some participants studied an original story, whereas other participants

studied the story with its sentences presented in a scrambled order. Participants

were able to recall 85% of the facts in the original story but only 32%

of the facts in the scrambled story.

Mandler and Johnson (1977) showed that children have much more difficulty

than adults do when recalling the causal structure of a story. Adults

recall events and the outcomes of those events together, whereas children

recall the outcomes but tend to forget how they were achieved. For instance,

children might recall from a particular story that the butter melted but might

forget that it melted because it was in the sun. Adults do not have trouble

with such simple causal structures, but they may have difficulty perceiving


the more complex relations connecting parts of a text. For instance, how easy

is it for you to specify the relation that connects this paragraph to the preceding

one?

Palinscar and Brown (1984) developed a training program that specifically

trains children to identify and formulate questions about such things

as the causal structure of text. They were able to raise poor-performing

seventh-graders from the 20th to the 56th percentile in reading comprehension.

This result is similar to that obtained by Bartlett (1978), who improved

reading performance by training students to identify the hierarchical structure

of text.

Memory for textual material is sensitive to the hierarchical and causal

structure of that text and tends to be better when people attend to that

structure.

Levels of Representation of a Text

Kintsch (1998) has argued that a text is represented at multiple levels. For instance,

consider the following pair of sentences taken from an experimental

story entitled “Nick Goes to the Movies.”

• Nick decided to go to the movies. He looked at a newspaper to see what

was playing.

Kintsch argues that this material is represented at three levels:

1. There is the surface level of representation of the exact sentences. This

can be tested by comparing people’s ability to remember the exact sentences

versus paraphrases like “Nick studied the newspaper to see what

was playing.”

2. There is also a propositional level (see Chapter 5) and this can be tested

by seeing whether people remember that Nick read the newspaper at all.

3. There is a situation model that consists of the major points of the story.

Thus, we can see whether people remember that “Nick wanted to see a

film”—something not said in the story but strongly implied.

In one study, Kintsch, Welsch, Schmalhofer, and Zimny (1990) looked at participants’ ability to remember these different sorts of information over periods of time ranging up to 4 days. The results are shown in Figure 13.10. As we saw in Chapter 5, surface information is forgotten quite rapidly, whereas the propositional information is better retained. However, the most striking retention function involves the situation information. After 4 days, participants have forgotten half the propositions but still remember perfectly what the story was about. This fits with many people’s experience in reading novels or seeing movies. They will quickly forget many of the details but will still remember what the novel or movie was about months later.

FIGURE 13.10 Memory for a story as a function of time: strengths of the traces for the surface form of sentences, the propositions that make up the story, and the high-level situation representation. (From Kintsch et al., 1990.)
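The qualitative pattern in Figure 13.10 can be captured by letting each level of representation decay at its own rate. The sketch below is purely illustrative: the exponential form and the decay constants are assumptions made for this sketch, not values fitted by Kintsch et al., but they reproduce the ordering in the figure, with surface form fading fastest, propositions more slowly, and the situation model hardly at all.

```python
import math

# Purely illustrative: treat each level of representation as a memory
# trace that decays exponentially at its own rate. The decay constants
# below are invented; Kintsch et al. (1990) report the actual retention
# functions over delays of 40 minutes to 4 days.
DECAY_PER_DAY = {"surface": 2.0, "proposition": 0.3, "situation": 0.01}

def trace_strength(level, days, initial=1.0):
    """Trace strength after a delay of the given number of days."""
    return initial * math.exp(-DECAY_PER_DAY[level] * days)

for days in (0.03, 2, 4):   # roughly 40 minutes, 2 days, and 4 days
    report = ", ".join(f"{level} {trace_strength(level, days):.2f}"
                       for level in DECAY_PER_DAY)
    print(f"after {days:>4} days: {report}")
```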


When people follow a story, they construct a high-level situation model of the

story that is more durable than the memory for the surface sentences or the

propositions that made up the story.

•Conclusions

The number and diversity of topics covered in this chapter testify to the impressive

cumulative progress in understanding language comprehension. It is fair to

say that we knew almost nothing about language processing when cognitive

psychology emerged from the collapse of behaviorism 50 years ago. Now, we

have a rather articulate picture of what is happening in scales that range from

100 ms after a word is heard to minutes later when large stretches of complex

text must be integrated. Research on language processing turns out to harbor a

number of theoretical controversies, some of which have been discussed in this

review of the field (e.g., whether early syntactic processing is separate from the

rest of cognition). However, such controversies should not blind us to the impressive

progress that has been made. The heat in the field has also generated

much light.


Questions for Thought

1. Answer the following question: “How many animals

of each kind did Moses take on the ark?” If you are like most people, you answered “two” and did not even notice that it was Noah, and not Moses, who took the animals on the ark (Erickson & Mattson, 1981). People do

this even when they are warned to look out for such

sentences and not answer them (Reder & Kusbit, 1991).

This phenomenon has been called the Moses illusion

even though it has been demonstrated with a wide

range of words besides Moses. What does the Moses

illusion say about how people incorporate the meaning

of individual words into sentences?

2. Christianson, Hollingworth, Halliwell, and Ferreira

(2001) found that when people read the sentence

“While Mary bathed the baby played in the crib” most

people actually interpret the sentence as implying that

Mary bathed the baby. Ferreira and Patson (2007)

argue that this implies that people do not carefully

parse sentences but settle on “good enough” interpretations.

If people don’t carefully process sentences, what

does that imply about the debate between proponents

of interactive processing and of the modularity

position about how people understand sentences like

“The woman painted by the artist was very attractive

to look at”?

3. Palinscar and Brown (1984) found that teaching

seventh-graders to make elaborative inferences while

reading dramatically improved their reading skills. Do

you think they would have been as successful if they

had focused on teaching children to make bridging

inferences?

4. Beilock, Lyons, Mattarella-Micke, Nusbaum, and Small

(2008) looked at brain activation while participants

listened to sentences about hockey versus other action

sentences. They found greater activation in the premotor

cortex for hockey sentences only for those participants

who were hockey fans. What does this say about the

role of expertise in making elaborative inferences and

developing situation models?


Key Terms

bridging (or backward) inferences
center-embedded sentences
constituent
elaborative (or forward) inferences
garden-path sentence
immediacy of interpretation
interactive processing
N400
P600
parsing
principle of minimal attachment
situation model
transient ambiguity
utilization


14

Individual Differences in Cognition

Clearly, all people do not think alike. There are many aspects of cognition, but

humans, naturally being an evaluative species, tend to focus on ways in which

some people perform “better” than other people. This performance is often identified

with the word intelligence—some people are perceived to be more intelligent

than others. Chapter 1 identified intelligence as the defining feature of the human

species. So, to call some members of our species more intelligent than others

can be a potent claim. As we will see, the complexity of human cognition makes

the placement of people on a unidimensional evaluative scale of intelligence

impossible.

This chapter will explore individual differences in cognition both because of its

inherent interest and because it sheds some light on the general nature of human

cognition. The big debate that will be with us throughout this chapter is the nurture-versus-nature debate. Are some people better at some cognitive tasks because

they are innately endowed with more capacity for those kinds of tasks or because

they have learned more knowledge relevant to these tasks? The answer, not surprisingly,

is that it is some of both, and we will consider and examine some of

the ways in which both basic capacities and experiences contribute to human

intelligence.

More specifically, this chapter will answer the following questions:

• How does the thinking of children develop as they mature?
• What are the relative contributions of neural growth versus experience to children’s intellectual development?
• What happens to our intellectual capacity through the adult years?
• What do intelligence tests measure?
• What are the different subcomponents of intelligence?


•Cognitive Development

Part of the uniqueness of the human species concerns the way in which children

are brought into the world and develop to become adults. Humans have

very large brains in relation to their body size, which created a major evolutionary

problem: How would the birth of such large-brained babies be physically

possible? One way was through progressive enlargement of the birth canal,

which is now as large as is considered possible given the constraints of mammalian

skeletons (Geschwind, 1980). In addition, a child is born with a skull that

is sufficiently pliable for it to be compressed into a cone shape to fit through the

birth canal. Still, the human birth process is particularly difficult compared with

that of most other mammals.

Figure 14.1 illustrates the growth of the human brain during gestation. At

birth, a child’s brain has more neurons than an adult brain has, but the state of

development of these neurons is particularly immature. Compared with those

of many other species, the brains of human infants will develop much more

after birth. At birth, a human brain occupies a volume of about 350 cubic centimeters (cm³). In the first year of life, it doubles to 700 cm³, and before a

human being reaches puberty, the size of its brain doubles again. Most other

mammals do not have as much growth in brain size after birth (S. J. Gould,

1977). Because the human birth canal has been expanded to its limits, much

of our neural development has been postponed until after birth.

FIGURE 14.1 Changes in structure in the developing brain (neural tube, hindbrain, midbrain, forebrain, and brain stem) from 25 days of gestation to full term at 36 weeks. (Adapted from Cowan, 1997, p. 116.)

Even though they spend 9 months developing in the womb, human infants

are quite helpless at birth and spend an extraordinarily long time growing to

adult stature—about 15 years, which is about a fifth of the human life span. In

contrast, a puppy, after a gestation period of just 9 weeks, is more capable at

birth than a human newborn. In less than a year, less than a tenth of its life

span, a dog has reached full size and reproductive capability.

Childhood is prolonged more than would be needed to develop large brains.

Indeed, the majority of neural development is complete by age 5. Humans are

kept children by the slowness of their physical development. It has been speculated

that the function of this slow physical development is to keep children in a

dependency relation to adults (de Beer, 1959). Much has to be learned to become

a competent adult, and staying a child for so long gives the human enough time

to acquire that knowledge. Childhood is an apprenticeship for adulthood.

Modern society is so complex that we cannot learn all that is needed by simply

associating with our parents for 15 years. To provide the needed training,

society has created social institutions such as high schools, colleges, and postcollege

professional schools. It is not unusual for people to spend more than

25 years, almost as long as their professional lives, preparing for their roles in

society.

Human development to adulthood is longer than that of other mammals to

allow time for growth of a large brain and acquisition of a large amount of

knowledge.

Piaget’s Stages of Development

Developmental psychologists have tried to understand the intellectual changes

that take place as we grow from infancy through adulthood. Many have been

particularly influenced by the Swiss psychologist Jean Piaget, who studied and

theorized about child development for more than half a century. Much of the

recent information-processing work in cognitive development has been concerned

with correcting and restructuring Piaget’s theory of cognitive development.

Despite these revisions, his research has organized a large set of qualitative

observations about cognitive development spanning the period from birth to

adulthood. Therefore, it is worthwhile to review these observations to get a

picture of the general nature of cognitive development during childhood.

According to Piaget, a child enters the world lacking virtually all the basic

cognitive competencies of an adult but gradually develops these competencies

by passing through a series of stages of development. Piaget distinguishes four

major stages. The sensory-motor stage is in the first 2 years of life. In this stage,

children develop schemes for thinking about the physical world—for instance,

they develop the notion of an object as a permanent thing in the world. The

second stage is the preoperational stage, which is characterized as spanning the

period from 2 to 7 years of age. Unlike the younger child, a child in this period

can engage in internal thought about the world, but these mental processes are

intuitive and lack systematicity. For instance, a 4-year-old who was asked to

describe his painting of a farm and some animals said, “First, over here is a


house where the animals live. I live in a house. So do my mommy and daddy.

This is a horse. I saw horses on TV. Do you have a TV?”

The third stage is the concrete-operational stage, which spans the period

from age 7 to age 11. In this period, children develop a set of mental operations

that allow them to treat the physical world in a systematic way. However, children

still have major limitations on their capacity to reason formally about the

world. The capacity for formal reasoning emerges in Piaget’s fourth period, the

formal-operational stage, spanning the years from 11 to adulthood. Upon entering

this period, although there is still much to learn, a child has become an

adult conceptually and is capable of scientific reasoning—which Piaget takes as

the paradigm case of mature intellectual functioning.

Piaget’s concept of a stage has always been a sore point in developmental psychology.

Obviously, a child does not suddenly change on an 11th birthday from

the stage of concrete operations to the stage of formal operations. There are large

differences among children and cultures, and the ages given are just approximations.

However, careful analysis of the development within a single child also

fails to find abrupt changes at any age. One response to this gradualness has been

to break down the stages into smaller substages. Another response has been to

interpret stages as simply ways of characterizing what is inherently a gradual and

continuous process. Siegler (1996) argued that, on careful analysis, all cognitive

development is continuous and gradual. He characterized the belief that children

progress through discrete stages as “the myth of the immaculate transition.”

Just as important as Piaget’s stage analysis is his analysis of children’s performance

in specific tasks within these stages. These task analyses provide the

empirical substance to back up his broad and abstract characterization of the

stages. Probably his most well-known task analysis is his research on conservation,

considered next.

Piaget proposed that children progress through four stages of increasing intellectual

sophistication: sensory-motor, preoperational, concrete-operational,

and formal-operational.

Conservation

The term conservation most generally refers to knowledge of the properties of

the world that are preserved under various transformations. A child’s understanding

of conservation develops as the child progresses through the Piagetian stages.

Conservation in the sensory-motor stage. A child must come to understand

that objects continue to exist over transformations in time and space. If a cloth is

placed over a toy that a 6-month-old is reaching for, the infant stops reaching

and appears to lose interest in the toy (Figure 14.2). It is as if the object ceases to

exist for the child when it is no longer in view. Piaget concluded from his experiments

that children do not come into the world with this knowledge but rather

develop a concept of object permanence during the first year.

FIGURE 14.2 An illustration of a child’s apparent inability to understand the permanence of an object. (Monkmeyer Press Photo Service, Inc. From Santrock and Yussen, 1989. Reprinted by permission of the publisher. Copyright © 1989 by Wm. C. Brown.)

According to Piaget, the concept of object permanence develops slowly and is

one of the major intellectual developments in the sensory-motor stage. An older


infant will search for an object that has been hidden, but more demanding tests

reveal failings in the older infant’s understanding of a permanent object. In one

experiment, an object is put under cover A, and then, in front of the child, it is

removed and put under cover B. The child will often look for the object under

cover A. Piaget argues that the child does not understand that the object will still

be in location B. Only after the age of 12 months can the child succeed consistently

at this task.

Conservation in the preoperational and concrete-operational stages. A

number of important advances in conservation occur at about 6 years of age,

which, according to Piaget, is the transition between the preoperational and the

concrete-operational stages. Before this age, children can be shown to have

some glaring errors in their reasoning. These errors start to correct themselves

at this point. The cause of this change has been controversial, with different

theorists pointing to language (Bruner, 1964) and the advent of schooling

(Cole & D’Andrade, 1982), among other possible causes. Here, we will content

ourselves with a description of the changes leading to a child’s understanding

of conservation of quantity.

As adults, we can almost instantaneously recognize that there are four apples

in a bowl and can confidently know that these apples will remain four when

dumped into a bag. Piaget was interested in how a child develops the concept of

quantity and learns that quantity is something that is preserved under various

transformations, such as moving the objects from a bowl to a bag. Figure 14.3

illustrates a typical conservation problem that has been posed by psychologists

in many variations to preschool children in countless experiments. A child is

presented with two rows of objects, such as checkers. The two rows contain the

same number of objects and have been lined up so as to correspond. The child is

asked whether the two rows have the same amount and responds that they do.

The child can be asked to count the objects in the two rows to confirm that

conclusion. Now, before the child’s eyes, one row is compressed so that it is shorter than the other row, but no checkers are added or removed. Again asked which row has more objects, the child now says that the longer row has more. The child appears not to know that quantity is something that is preserved under transformations such as the compression of space. If asked to count the two rows, the child expresses great surprise that they have the same number.

FIGURE 14.3 A typical experimental situation to test for conservation of number. (Monkmeyer Press Photo Service, Inc. From Santrock and Yussen, 1989. Reprinted by permission of the publisher. Copyright © 1989 Wm. C. Brown.)

A general feature in demonstrations of lack of conservation is that the

irrelevant physical features of a display distract children. Another example is the

liquid-conservation task, which is illustrated in Figure 14.4. A child is shown two

identical beakers containing identical amounts of water and an empty, tall,

thin beaker. When asked whether the two identical beakers hold the same

amount of water, the child answers “Yes.” The water from one beaker is then

poured into the tall, thin beaker. When asked whether the amount of water in

the two containers is the same, the child now says that the tall beaker holds

more. Young children are distracted by physical appearance and do not relate

their having seen the water poured from one beaker into the other to the

unchanging quantity of liquid. Bruner (1964) demonstrated that a child is less

likely to fail to conserve if the tall beaker is hidden from sight while it is being

filled; then the child does not see the high column of water and so is not distracted

by physical appearance. Thus, it is a case of being overwhelmed by

physical appearance. The child does understand that water preserves its quantity

after being poured.

FIGURE 14.4 A typical experimental situation to test for conservation of liquid. (Monkmeyer Press Photo Service, Inc. From Santrock and Yussen, 1989. Reprinted by permission of the publisher. Copyright © 1989 Wm. C. Brown Publishers.)

Failure of conservation has also been shown with weight and volume of solid

objects (for a discussion of studies of conservation, see Brainerd, 1978; Flavell,

1985; Ginsburg & Opper, 1980). It was once thought that the ability to perform

successfully on all these tasks depended on acquiring a single abstract concept of

conservation. Now, however, it is clear that successful conservation appears earlier

on some tasks than on others. For instance, conservation of number usually

appears before conservation of liquid. Additionally, children in transition will

show conservation of number in one experimental situation but not in another.


Conservation in the formal-operational period. When children reach the

formal-operational period, their understanding of conservation reaches new

levels of abstraction. They are able to understand the idealized conservations that

are part of modern science, including concepts such as the conservation of

energy and the conservation of motion. In a frictionless world, an object once set

in motion continues, an abstraction that the child never experiences. However,

the child comes to understand this abstraction and the way in which it relates

to experiences in the real world.

As children develop, they gain increasingly sophisticated understanding

about what properties of objects are conserved under which transformations.

What Develops?

Clearly, as Piaget and others have documented, major intellectual changes take

place in childhood. However, there are serious questions concerning what underlies

these changes. There are two ways of explaining why children perform

better on various intellectual tasks as they get older: One is that they “think

better,” and the other is that they “know better.” The think-better option holds

that children’s basic cognitive processes become better. Perhaps they can hold


more information in working memory or process information faster. The know-better option holds that children have learned more facts and better methods as

they get older. I refer to this as “know better,” not “know more,” because it is not

just a matter of adding knowledge but also a matter of eliminating erroneous

facts and inappropriate methods (such as relying on appearance in the conservation

tasks). Perhaps this superior knowledge enables them to perform the tasks

more efficiently. A computer metaphor is apt here: A computer application can

be made to perform better by running the same program on a faster machine

that has more memory or by running a better program on the same machine.

Which is it in the case of child development—better machine or better program?

Rather than the reason being one or the other, the child’s improvement

is due to both factors, but what are their relative contributions? Siegler (1998)

argued that many of the developmental changes that take place in the first 2 years

are to be understood in relation to neural changes. Such changes in the first

2 years are considerable. As we already noted, an infant is born with more

neurons than the child will have at a later age. Although the number of neurons

decreases, the number of synaptic connections increases 10-fold in the first

2 years, as illustrated in Figure 14.5. The number of synapses reaches a peak at

about age 2, after which it declines.

FIGURE 14.5 Postnatal development of human cerebral cortex around Broca’s area: (a) newborn; (b) 3 months; (c) 24 months. (From Lenneberg, 1967.)

The earlier pruning of neurons and the later pruning of synaptic connections can be thought of as a process by which the

brain can fine-tune itself. The initial overproduction guarantees that there will

be enough neurons and synapses to process the required information. When

some neurons or synapses are not used, and so are proved unnecessary, they

wither away (Huttenlocher, 1994). After age 2, there is not much further growth

of neurons or their synaptic connections, but the brain continues to grow

because of the proliferation of other cells. In particular, the glial cells increase,

including those that provide the myelinated sheaths around the axons of

neurons. As discussed in Chapter 1, myelination enables the axon to conduct

brain signals rapidly. The process of myelination continues into the late teens

but at an increasingly gradual pace. The effects of this gradual myelination can

be considerable. For instance, the time for a nerve impulse to cross the hemispheres

in an adult is about 5 milliseconds (ms), which is four to five times as

fast as in a 4-year-old (Salamy, 1978).

It is tempting to emphasize the improvement in processing capacity as the

basis for improvement after age 2. After all, consider the physical difference

between a 2-year-old and an adult. When my son was 2 years old, he had difficulty

mastering the undoing of his pajama buttons. If his muscles and coordination

had so much maturing to do, why not his brain? This analogy, however, does

not hold: A 2-year-old has reached only 20% of his adult body weight, whereas

the brain has already reached 80% of its final size. Cognitive development after

age 2 may depend more on the knowledge that a person puts into his or her

brain than on any improvement in the physical capacities of the brain.

Neural development is a more important contributor to cognitive development

before the age of 2 than after.

The Empiricist-Nativist Debate

There is relatively little controversy either about the role that physical development

of the brain plays in the growth of human intellect or about the incredible

importance of knowledge to human intellectual processes. However, there is an

age-old nature-versus-nurture controversy that is related to, but different from, the

issue of physical growth versus knowledge accumulation. This debate is between

the nativists and the empiricists (see Chapter 1) about the origins of that knowledge.

The nativists argue that the most important aspects of our knowledge about

the world appear as part of our genetically programmed development, whereas

the empiricists argue that virtually all knowledge comes from experience with the

environment. One reason that this issue is emotionally charged is that it would

seem tied to conceptions about what makes humans special and what their potential

for change is. The nativist view is that we sell ourselves short if we believe

that our minds are just a simple reflection of our experiences, and empiricists

believe that we undersell the human potential if we think that we are not capable

of fundamental change and improvement. The issue is not this simple, but it

nonetheless fuels great passion on both sides of the debate.

We have already visited this issue in the discussions of language acquisition

and of whether important aspects of human language are innately specified,


such as language universals. However, similar arguments have been made for our

knowledge of human faces or our knowledge of biological categories. A particularly

interesting case concerns our knowledge of number. Piaget used experiments

such as those on number conservation to argue that we do not have an

innate sense of numbers, but others have used experiments to argue otherwise.

For instance, in studies of infant attention, young children have been shown to

discriminate one object from two and two from three (Antell & Keating, 1983;

Starkey, Spelke, & Gelman, 1990; van Loosbroek & Smitsman, 1992). In these

studies, young children become bored looking at a certain number of objects but

show renewed interest when the number of objects changes. There is even evidence

for a rudimentary ability to add and subtract (Simon, Hespos, & Rochat,

1995; Wynn, 1992). For instance, if a 5-month-old child sees one object appear on

stage and then disappear behind a screen, and then sees a second object appear

on stage and disappear behind the screen, the child is surprised if there are not

two objects when the screen is raised (Figure 14.6—note this contradicts Piaget’s

claims about failure of conservation in the sensory-motor stage). This reaction is

taken as evidence that the child calculates 1 + 1 = 2. Dehaene (2000) argued that

a special structure in the parietal cortex is responsible for representing number

and showed that it is especially active in certain numerical judgment tasks.

FIGURE 14.6 In Karen Wynn’s experiment, she showed 5-month-old infants one or two dolls on a stage. Then she hid the dolls behind a screen and visibly removed or added one. When she lifted the screen out of the way, the infants would often stare longer when shown a wrong number of dolls.

On the other hand, others argue that most of human knowledge could not

be coded into our genes. This argument was strengthened with the realization

in 2001 that a human has only 30,000 genes—only about one-third the number

originally estimated. Moreover, more than 97% of these genes are generally believed

to be shared with chimpanzees, which does not leave much for encoding


the rich knowledge that is uniquely human. Elman et al. (1996), among other

researchers, have argued that, although the basic mechanisms of human information

processing are innately specified and these mechanisms enable human

thought, most of the knowledge and structure of the mind must be acquired

through experience.

There is considerable debate in cognitive science about the degree to which

our basic knowledge is innate or acquired from experience.

Increased Mental Capacity

A number of developmental theories have proposed that there are basic cognitive

capacities that increase from birth through the teenage years (Case, 1985;

Fischer, 1980; Halford, 1982; Pascual-Leone, 1980). These theories are often called

neo-Piagetian theories of development. Consider Case’s memory-space proposal,

which is that a growing working-memory capacity is the key to the developmental

sequence. The basic idea is that more-advanced cognitive performance requires

that more information be held in working memory.

An example of this analysis is Case’s (1978) description of how children solve

Noelting’s (1975) juice problems. A child is given two empty pitchers, A and B,

and is told that several tumblers of orange juice and tumblers of water will be

poured into each pitcher. The child’s task is to predict which pitcher will taste

most strongly of orange juice. Figure 14.7 illustrates four stages of juice problems

that children can solve at various ages. At the youngest age, children can

reliably solve only problems where all orange juice goes into one pitcher and all

water into another. At ages 4 to 5, they can count the number of tumblers of

orange juice going into a pitcher and choose the pitcher that holds the larger

number—not considering the number of tumblers of water. At ages 7 to 8, they

notice whether there is more orange juice or more water going into a pitcher. If

pitcher A has more orange juice than water and pitcher B has more water than

orange juice, they will choose pitcher A even if the absolute number of glasses of

orange juice is fewer. Finally, at age 9 or 10, children compute the difference between

the amount of orange juice and the amount

of water (still not a perfect solution).

Case argued that the working-memory requirements

differ for the various types of problems represented

in Figure 14.7. For the simplest problems,

a child has to keep only one fact in memory—

which set of tumblers has the orange juice. Children

at ages 3 to 4 can keep only one such fact in

mind. If both sets of tumblers have orange juice,

the child cannot solve the problem. For the second

type of problem, a child needs to keep two things

in memory—the number of orange juice tumblers

in each array. In the third type of problem, a child

needs to keep additional partial products in mind

to determine which side has more orange juice than water.

FIGURE 14.7 The Noelting juice problem solved by children at various ages. The problem is to tell which pitcher will taste more strongly of orange juice after participants observe the tumblers of water and tumblers of juice that will be poured into each pitcher.

To solve the fourth type of problem, a child needs four facts to make

a judgment:

1. The absolute difference in tumblers going into pitcher A

2. The sign of the difference for pitcher A (i.e., whether there is more water

or more orange juice going into the pitcher)

3. The absolute difference in tumblers going into pitcher B

4. The sign of the difference for pitcher B

Case argued that children’s developmental sequences are controlled by their

working-memory capacity for the problem. Only when they can keep four facts

in memory will they achieve the fourth stage in the developmental sequence.

Case’s theory has been criticized (e.g., Flavell, 1978) because it is hard to decide

how to count the working-memory requirements.
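Case's analysis is easy to state as a set of decision rules, one strategy per stage. The following sketch is our own rendering, not code from the text; in particular, the fallback behavior when a strategy's cue fails to discriminate is an assumption. Each later strategy requires more facts held in working memory.

```python
# A sketch of the four strategies in Case's (1978) analysis of Noelting's
# juice problems. Pitchers are (juice_tumblers, water_tumblers); each
# strategy returns "A", "B", or "same".

def stage1(a, b):
    """Ages 3-4: hold one fact - which side has the orange juice."""
    if a[0] > 0 and b[0] == 0:
        return "A"
    if b[0] > 0 and a[0] == 0:
        return "B"
    return "same"  # cannot solve cases where both sides have juice

def stage2(a, b):
    """Ages 4-5: hold two facts - the juice counts; ignore the water."""
    return "A" if a[0] > b[0] else "B" if b[0] > a[0] else "same"

def stage3(a, b):
    """Ages 7-8: note for each side whether juice exceeds water."""
    more_juice_a, more_juice_b = a[0] > a[1], b[0] > b[1]
    if more_juice_a and not more_juice_b:
        return "A"
    if more_juice_b and not more_juice_a:
        return "B"
    return stage2(a, b)  # assumed fallback when the signs do not differ

def stage4(a, b):
    """Ages 9-10: compare juice-minus-water differences (still imperfect,
    because the fully correct comparison is of juice-to-water ratios)."""
    diff_a, diff_b = a[0] - a[1], b[0] - b[1]
    return "A" if diff_a > diff_b else "B" if diff_b > diff_a else "same"

# Pitcher A gets 2 juice and 1 water; B gets 3 juice and 3 water.
# The ratios actually favor A, but the stage-2 strategy picks B.
print(stage2((2, 1), (3, 3)))  # "B" - counts juice tumblers only
print(stage4((2, 1), (3, 3)))  # "A" - compares the differences
```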

Another question concerns what controls the growth in working memory.

Case argued that a major factor in the increase of working memory is increased

speed of neural function. He cited the evidence that the degree of myelination

increases with age, with spurts approximately at those points where he postulated

major changes in working memory. On the other hand, he also argued

that practice plays a significant role as well: With practice, we learn to perform

our mental operations more efficiently, and so they do not require as much

working-memory capacity.

The research of Kail (1988) can be viewed as consistent with the proposal

that speed of mental operation is critical. This investigator looked at a number

of cognitive tasks, including the mental rotation task examined in Chapter 4.

He presented participants with pairs of letters in different orientations and

asked them to judge whether the letters were the same or were mirror images of

one another. As discussed in Chapter 4, participants tend to mentally rotate an

image of one object into congruence with the other to make this judgment. Kail

observed people, who ranged in age from 8 to 22, performing this task and

found that they became systematically faster with age. He was interested in

rotation rate, which he measured as the number of milliseconds to rotate one

degree of angle. Figure 14.8 shows these data, plotting rate of rotation as a function

of age. The time to rotate a degree of angle decreases

as a function of age.

FIGURE 14.8 Rates of mental rotation, estimated from the slope of the function relating response time to the orientation of the stimulus. (Kail, 1988.)

In some of his writings, Kail argued that this result is

evidence of an increase in basic mental speed as a function

of age. However, an alternative hypothesis is that it reflects

accumulating experience over the years at mental rotation.

Kail and Park (1990) put this hypothesis to the test by giving

11-year-old children and adults more than 3,000 trials

of practice at mental rotation. They found that both groups

sped up but that adults started out faster. However, Kail

and Park showed that all their data could be fit by a single

power function that assumed that the adults came into the

experiment with what amounted to an extra 1,800 trials

of practice (Chapters 6 and 9 showed that learning curves


tended to be fit by power functions). Figure 14.9 shows the resulting data,

with the children’s learning function superimposed on the adults’ learning

function. The practice curve for the children assumes that they start with about

150 trials of prior practice, and the practice curve for the adults assumes that they

start with 1,950 trials of prior practice. However, after 3,000 trials of practice,

children are a good bit faster than beginning adults. Thus, although the rate of

information processing increases with development, this increase may have a

practice-related rather than a biological explanation.

FIGURE 14.9 Data from Kail and Park (1990): Children and adults are on the same learning curve, but adults are advanced 1,800 trials.
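A single power function of total practice can express the Kail and Park fit. In the sketch below, only the prior-practice offsets (about 150 trials for children and 1,950 for adults) come from the text; the functional form and the scale and exponent values are illustrative assumptions.

```python
# A sketch of the single power-function fit in Kail and Park (1990):
# rotation rate declines as a power function of total practice, with the
# two groups differing only in how much prior practice they bring.
# The values a = 60.0 and b = 0.4 are illustrative, not fitted values.

def rotation_rate(trials, prior_trials, a=60.0, b=0.4):
    """Mental rotation rate (ms/degree) after a given amount of practice."""
    return a * (prior_trials + trials) ** (-b)

for trials in (0, 1000, 3000):
    child = rotation_rate(trials, prior_trials=150)
    adult = rotation_rate(trials, prior_trials=1950)
    print(f"{trials:4d} trials: child {child:.1f} ms/deg, adult {adult:.1f} ms/deg")
```

With these illustrative values, children who have completed the 3,000 experimental trials (about 2.4 ms/degree) are faster than adults were at the start (about 2.9 ms/degree), matching the pattern described above.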

Qualitative and quantitative developmental changes take place in cognitive

development because of increases both in working-memory capacity and in

rate of information processing.

Increased Knowledge

Chi (1978) demonstrated that developmental differences may be knowledge

related. Her domain of demonstration was memory. Not surprisingly, children do

worse than adults on almost every memory task. Is their performance because


their memories have less capacity or is it because

they know less about what they are being asked to remember?

To address this question, Chi compared the

memory performance of 10-year-olds with that of

adults on two tasks—a standard digit-span task and

a chess memory task (see the discussion of these

tasks in Chapters 6 and 9). The 10-year-olds were

skilled chess players, whereas the adults were novices

at chess. The chess task was the one discussed in

Chapter 9 on page 258—a chessboard was presented

for 10 s and then withdrawn, and participants were

then asked to reproduce the chess pattern.

Figure 14.10 illustrates the number of chess

pieces recalled by children and adults. It also contrasts

these results with the number of digits recalled

in the digit-span task. As Chi predicted, the adults

were better on the digit-span task, but the children were better on the chess

task. The children’s superior chess performance was attributed to their greater

knowledge of chess. The adults’ superior digit performance was due to their

greater familiarity with digits—the dramatic digit-span performance of participant

S. F. in Chapter 9 shows just how much digit knowledge can lead to

improved memory performance.

The novice-expert contrasts in Chapter 9 are often used to explain developmental

phenomena. We saw that a great deal of experience in a domain is

required if a person is to become an expert. Chi’s argument is that children,

because of their lack of knowledge, are near universal novices, but they can become

more expert than adults through concentrated experience in one domain,

such as chess.

The Chi experiment contrasted child experts with adult novices. Schneider,

Körkel, and Weinert (1988) looked at the effect of expertise at various age

levels. They categorized German schoolchildren as either experts or novices

with respect to soccer. They did so separately for grade levels 3, 5, and 7. The

students at each grade level were asked to recall a story about soccer. Table 14.1

illustrates the amount of recall displayed as a function of grade level and

expertise. The effect of expertise was much greater than that of grade level. On

a recognition test, there was no effect of grade level, only an effect of expertise.

They also classified each group of participants into high-ability

and low-ability participants on the basis of their performance

on intelligence tests. Although such tests generally predict

memory for stories, Schneider et al. found no effect of general

ability level, only of knowledge for soccer. They argue that high-ability students are just those who know a lot about a lot of

domains and consequently generally do well on memory tests.

However, when tested on a story about a specific domain such

as soccer, a high-ability student who knows nothing about that

domain will do worse than a low-ability student who knows a

lot about the domain.

FIGURE 14.10 Number of chess pieces and number of digits recalled by children versus adults. (From Chi, 1978.)

TABLE 14.1 Mean Percentages of Idea Units Recalled as a Function of Grade and Expertise

Grade   Soccer Experts   Soccer Novices
3       54               32
5       52               33
7       61               42

From Körkel (1987).


In addition to lack of relevant knowledge, children have difficulty on memory

tasks because they do not know the strategies that lead to improved memory. The

clearest case concerns rehearsal. If you were asked to dial a novel seven-digit telephone

number, I would hope that you would rehearse it until you were confident

that you had it memorized or until you had dialed the number. It would not

occur to young children that they should rehearse the number. In one study comparing

5-year-olds with 10-year-olds, Keeney, Cannizzo, and Flavell (1967) found

that 10-year-olds almost always verbally rehearsed a set of objects to be remembered,

whereas 5-year-olds seldom did. Young children’s performance often improves

if they are instructed to follow a verbal rehearsal strategy, although very

young children are simply unable to execute such a rehearsal strategy.

Chapter 6 emphasized the importance of elaborative strategies for good

memory performance. Particularly for long-term retention, elaboration appears

to be much more effective than rote rehearsal. There also appear to be sharp

developmental trends with respect to the use of elaborative encoding strategies.

For instance, Paris and Lindauer (1976) looked at the elaborations that children

use to relate two paired-associates nouns such as lady and broom. Older children

are more likely to generate interactive sentences such as The lady flew

on the broom than static sentences such as The lady had a broom. Such interactive

sentences will lead to better memory performance. Young children are also

poorer at drawing the inferences that improve memory for a story (Stein &

Trabasso, 1981).

Younger children often do worse on tasks than do older children, because they

have less relevant knowledge and poorer strategies.

Cognition and Aging

Changes in cognition do not cease when we reach adulthood. As we get older,

we continue to learn more things, but human cognitive ability does not uniformly

increase with added years, as we might expect if intelligence were only a

matter of what one knows. Figure 14.11 shows data compiled by Salthouse

(1992) on two components of the Wechsler Adult Intelligence Scale-Revised

(WAIS-R). One component deals with verbal intelligence, which includes elements

such as vocabulary and language comprehension. As you can see, this

component maintains itself quite constantly through the years. In contrast, the

performance component, which includes abilities such as reasoning and problem

solving, decreases dramatically.

The importance of these declines in basic measures of cognitive ability can

be easily exaggerated. Such tests are typically given rapidly, and older adults do

better on slower tests. Additionally, such tests tend to be like school tests, and

young adults have had more recent experience with such tests. When it comes

to relevant job-related behavior, older adults often do better than younger

adults (e.g., Perlmutter, Kaplan, & Nyquist, 1990), owing both to their greater

accumulation of knowledge and to a more mature approach to job demands.

There is also evidence that previous generations did not do as well on tests even

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 404

when they were young. This is the so-call “Flynn effect”—IQ scores appear to

have risen about 3 points per decade (Flynn, 1987). The comparisons in Figure

14.11 are not only of people of different ages but also of people who grew up in

different periods. Some of the apparent decline in the figure might be due to

differences among generations (education, nutrition, etc.) and not age-related

factors.

Although non-age-related factors may explain some of the decline shown in

Figure 14.9, there are substantial age-related declines in brain function. Brain

cells gradually die. Some areas are particularly susceptible to cell death. The

hippocampus, which is particularly important to memory (see Chapter 7),

loses about 5% of its cells every decade (Selkoe, 1992). Other cells, though they

might not die, have been observed to shrink and atrophy. On the other hand,

there is some evidence for compensatory growth: Cells remaining in the

hippocampus will grow to compensate for the age-related deaths of their neighbors.

There is also increasing evidence for the birth of new neurons, particularly

in the region of the hippocampus (E. Gould & Gross, 2002).Moreover, the

number of new neurons seems to be very much related to the richness of a person’s

experience. Although these new neurons are few in number compared

with the number lost, they may be very valuable because new neurons are more

plastic and may be critical to encoding new experiences.

Although there are age-related neural losses, they may be relatively minor

in most intellectually active adults. The real problem concerns the intellectual

deficits associated with various brain-related disorders. The most common

of these disorders is Alzheimer’s disease, which is associated with substantial

Cognitive Development | 405

Chronological age (years)

20 30 40 50 60 70 80

70

80

90

100

110

120

130

Mean IQ

Verbal

Performance

FIGURE 14.11 Mean verbal and performance IQs from the WAIS-R standardization sample as

a function of age. (From Salthouse, 1992. Reprinted by permission from LEA, Inc.)

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 405

impairment of brain function, particularly in the

temporal region including the hippocampus. Many

of these diseases progress slowly, and some of the

reason for age-related deficits in tests such as that of

Figure 14.11 may be due to the fact that some of the

older participants are suffering from the early stages

of such diseases. However, even when health factors

are taken into account and when the performance

of the same participants is tracked in longitudinal

studies (so there is not a generational confound),

there is evidence for age-related intellectual decline,

although it may not become significant until after

age 60 (Schaie, 1996).

As we get older, there is a race going on between

growth in knowledge and loss of neural function.

People in many professions (artists, scientists, philosophers)

tend to produce their best work in their midthirties.

Figure 14.12 shows some interesting data

from Lehman (1953). He examined the works of 182

famous deceased philosophers who collectively wrote

some 1,785 books. Figure 14.12 plots the probability that a book was considered

that philosopher’s best book as a function of the age at which it was

written. These philosophers remained prolific, publishing many books in their

seventies. However, as Figure 14.12 displays, a book written in this decade is

unlikely to be considered a philosopher’s best.1 Lehman reviewed data from

a number of fields consistent with the hypothesis that the thirties tend to be

the time of peak intellectual performance. However, as Figure 14.12 shows,

people often maintain relatively high intellectual performance into their forties

and fifties.

The evidence for an age-related correlation between brain function and

cognition makes it clear that there is a contribution of biology to intelligence

that knowledge cannot always overcome. Salthouse (1992) argued that, in

information-processing terms, people lose their ability to hold information in

working memory with age. He contrasted participants of different ages on the

reasoning problems presented in Figure 14.13. These problems differ in the

number of premises that need to be combined to come to a particular solution.

Figure 14.13 shows how people at various ages perform in these tasks. As can be

seen, people’s ability to solve these problems generally declines with the number

of premises that need to be combined. However, this drop-off is much

steeper for older adults. Salthouse argued that older adults are slower than

younger adults in information processing, which inhibits their ability to maintain

information in working memory.

406 | Individual Differences in Cognition

70 80

.15

.10

.05

20 30 40 50

Decade

Probability of best book

60

.00

FIGURE 14.12 Probability that

a particular book will become a

philosopher’s best as a function

of the age at which the philosopher

wrote the book. (Adapted from

Lehman, 1953.)

1 It is important to note that this graph denotes the probability of a specific book being the best, and so the

outcome is not an artifact of the number of books written during a decade.

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 406

Cognitive Development | 407

Twenties

Sixties

Forties

Number of premises

Percentage correct

1 2 3 4 5

50

60

70

80

90

100

Q and R do the OPPOSITE

If Q INCREASES, what will happen to R?

D and E do the OPPOSITE

C and D do the SAME

If C INCREASES, what will happen to E?

R and S do the SAME

Q and R do the OPPOSITE

S and T do the OPPOSITE

If Q INCREASES, what will happen to T?

U and V do the OPPOSITE

W and X do the SAME

T and U do the SAME

V and W do the OPPOSITE

If T INCREASES, what will happen to X?

FIGURE 14.13 Illustration of integrative reasoning trials hypothesized to vary in working-memory

demands (top), and mean performance of adults in their twenties, forties, and sixties with each

trial type (bottom).

Increased knowledge and maturity sometimes compensate for age-related

declines in rates of information processing.

Summary for Cognitive Development

With respect to the nature-versus-nurture issue, the developmental data paint a

mixed picture. A person’s brain is probably at its best physically in the early twenties,

and intellectual capacity tends to follow brain function. The relation seems

particularly strong in the early years of childhood. However, we saw evidence that

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 407

practice could overcome age-related differences in speed (Figure 14.9), and knowledge

could be a more dominant factor than age (Figure 14.10 and Table 14.1).

Additionally, the point of peak intellectual output appears to take place later

than in a person’s twenties (Figure 14.12), indicating the need for accumulated

knowledge. As discussed in Chapter 9, truly exceptional performance in a field

tends to require at least 10 years of experience in that field.

•Psychometric Studies of Cognition

We now turn from considering how cognition varies as a function of age to

considering how cognition varies within a population of a fixed age. All this

research has basically the same character. It entails measuring the performances

of various people on a number of tasks and then looking at the way in which

these performance measures correlate across different tests. Such tests are

referred to as psychometric tests. This research has established that there is not

a single dimension of “intelligence” on which people vary but rather that individual

differences in cognition are much more complex. We will first examine

research on intelligence tests.

Intelligence Tests

Research on intelligence testing has had a much longer sustained intellectual

history than cognitive psychology. In 1904, the Minister of Public Instruction in

Paris named a commission charged with identifying children in need of remedial

education. Alfred Binet set about developing a test that would objectively

identify students having intellectual difficulty. In 1916, Lewis Terman adapted

Binet’s test for use with American students. His efforts led to the development

of the Stanford-Binet, a major general intelligence test in use in America today

(Terman & Merrill, 1973). The other major intelligence test used in America is

the Wechsler, which has separate scales for children and adults. These tests include

measures of digit span, vocabulary, analogical reasoning, spatial judgment,

and arithmetic. A typical question for adults on the Stanford-Binet is,

“Which direction would you have to face so your right hand would be to the

north?” A great deal of effort goes into selecting test items that will predict

scholastic performance.

Both of these tests produce measures that are called intelligence quotients

(IQs). The original definition of IQ relates mental age to chronological age. The

test establishes one’s mental age. If a child can solve problems on the test that

the average 8-year-old can solve, then the child has a mental age of 8 independent

of chronological age. IQ is defined as the ratio of mental age to chronological

age multiplied by 100 or

IQ _ 100 _ MA/CA

where MA is mental age and CA is chronological age. Thus, if a child’s mental

age were 6 and chronological age were 5, the IQ would be 100 _ 6/5 _ 120.

408 | Individual Differences in Cognition

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 408

This definition of IQ proved unsuitable for a number of reasons. It cannot

extend to measurement of adult intelligence, because performance on intelligence

tests starts to level off in the late teens and declines in later years. To deal

with such difficulties, the common way of defining IQ now is in terms of deviation

scores. A person’s raw score is subtracted from the mean score for that person’s

age group, and then this difference is transformed into a measure that will

vary around 100, roughly as the earlier IQ scores would. The precise definition

is expressed as

(score _ mean)

IQ _ 100 _ 15 _ ________________

standard deviation

where standard deviation is a measure of the variability of the scores. IQs so

measured tend to be distributed according to a normal distribution. Figure 14.14

shows such a normal distribution of intelligence scores and the percentage of

people who have scores in various ranges.

Whereas the Stanford-Binet and the Weschler are general intelligence

tests, many others were developed to test specialized abilities, such as spatial

ability. These tests partly owe their continued use in the United States to the

fact that they do predict performance in school with some accuracy, which

was one of Binet’s original goals. However, their use for this purpose is

controversial. In particular, because such tests can be used to determine

who can have access to what educational opportunities, there is a great deal

of concern that they should be constructed so as to prevent biases against

certain cultural groups. Immigrants often do poorly on tests of intelligence

because of cultural biases on the tests. For instance, immigrant Italians of

less than a century ago scored an average of 87 on IQ tests (Sarason &

Doris, 1979), whereas today their descendants have slightly above average IQs

(Ceci, 1991).

Psychometric Studies of Cognition | 409

70

2% 11% 11% 2%

33% 33%

85 100 115 130

FIGURE 14.14 A normal distribution of IQ measures.

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 409

The very concept of intelligence is culturally relative. What one culture

values as intelligent another culture will not. For instance, the Kpelle, an

African culture, think that the way in which Westerners sort instances into

categories (for instance, sorting apples and oranges into the same category—a

basis for some items in intelligence tests) is foolish (Cole, Gay, Glick, & Sharp,

1971). Robert Sternberg (personal communication) notes that some cultures

do not even have a word for intelligence. Still, the fact remains that intelligence

tests do predict performance in our (Western) schools. Whether they

are doing a valuable service in assessing students for schools or are simply

enforcing arbitrary cultural beliefs about what is to be valued is a difficult

question.

Related to the issue of the fairness of intelligence tests is whether they

measure innate endowment or acquired ability (the nature-versus-nurture

issue again). Potentially definitive data would seem to come from studies of

identical twins reared apart. Sometimes such twins have been adopted into different

families—they have identical genetic endowment but different environmental

experiences. The research on this topic is controversial (Kamin, 1974),

but analyses (Bouchard, 1983; Bouchard & McGue, 1981) indicate that identical

twins raised apart have IQs much more similar to each other than do

nonidentical fraternal twins raised in the same family. This evidence seems

to indicate the existence of a strong innate component of IQ. Yet drawing the

conclusion that intelligence is largely innate would be a mistake. Intelligence and

IQ are by no means the same thing. Because of the goals of intelligence tests,

they must predict success across a range of environments, particularly academic.

Thus they must discount the contributions of specific experiences to

intelligence. As noted in Chapter 9, for instance, chess masters tend not to have

particularly high IQs. This tendency is more a comment on the IQ test than on

chess masters. If an IQ test focused on chess experience, it would have little

success in predicting academic success generally. Thus, intelligence tests try

to measure raw abilities and general knowledge that are reasonably expected

of everyone in a culture. However, as we saw in Chapter 9, excellence in any

specific domain depends on knowledge and experience that are not general in

the culture.

Another interesting demonstration of this lack of correlation between

expertise and IQ was performed by Ceci and Liker (1986). These researchers

looked at the ability of avid horse-racing fans to handicap races. They

found that handicapping skill is related to the development of a complex

interactive model of horse racing but that there was no relation between this

skill and IQ.

Although specific experience is clearly important to success in any field, the

remarkable fact is that these intelligence tests are able to predict success in certain

endeavors. They predict with modest accuracy both performance in school

and general success in life (or at least in Western societies).What is it about the

mind that they are measuring? Much of the theoretical work in the field has

been concerned with trying to answer this question. To understand how this

question has been pursued, one must understand a little about a major method

of the field, factor analysis.

410 | Individual Differences in Cognition

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 410

Psychometric Studies of Cognition | 411

scoring members of our society have more limited opportunities

and often are sorted by their test scores into

environments where there is more antisocial behavior.

Another confounding factor is that success in society

is at every point determined by judgments of

other members of the society.

For instance, most studies of job

performance use measures like

ratings of supervisors rather than

actual measures of job performance.

Promotions are often largely

dependent on judgments of superiors.

Also, legal resolutions such

as sentencing decisions in criminal

cases have strong judgmental

aspects to them. It could be that

IQ more strongly affects these

social judgments than the actual

performances being judged such as how well one does

one’s job or how bad a particular activity was. Individuals

in positions of power, such as judges and supervisors,

tend to have high IQs. Thus, there is the possibility that

some of the success associated with high IQ is an ingroup

effect where high-IQ people favor people who are

like them.

Implications

Does IQ determine success in life?

IQ appears to have a strong predictive relationship to

many socially relevant factors besides academic performance.

The American Psychological Report Intelligence:

Knowns and Unknowns (Neisser et al., 1996) states that

IQ accounts for about one-fifth of the variance (positive

correlations in the range of .3 to .5)

in factors like job performance and

income. It has an even stronger relationship

to socioeconomic status.

There are weaker negative correlations

with antisocial measures like

criminal activity.

There is a natural tendency to

infer from this that IQ is directly

related to being a successful member

of our society, but there are

reasons to question a direct relationship.

Access to various educational

opportunities and to some jobs depends on test

scores. Access to other professions depends on completing

various educational programs, the access to which is

partly determined by test scores. Given the strong relationship

between IQ and these test scores, we would

expect that higher-IQ members of our society would get

better training and professional opportunities. Lower-

Standard intelligence tests measure general factors that predict success

in school.

Factor Analysis

The general intelligence tests contain a number of subtests that measure individual

abilities. As already noted, many specialized tests also are available for

measuring particular abilities. The basic observation is that people who do well

on one test or subtest tend to do well on another test or subtest. The degree to

which people perform comparably on two subtests is measured by a correlation

coefficient. If all the same people who did well on one test did just as well on

another, the correlation between the two tests would be 1. If all the people who

did well on one test did proportionately badly on another, the correlation coefficient

would be _1. If there were no relation between how people did on one

test and how they did on another test, the correlation coefficient would be zero.

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 411

Typical correlations between tests are positive, but not 1, indicating a less than

perfect relation between performance on one test and on another.

For example, Hunt (1985) looked at the relations among the seven tests described

in Table 14.2. Table 14.3 shows the intercorrelations among these test

scores. As can be seen, some pairs of tests are more correlated than others. For

instance, there is a relatively high (.67) correlation between reading comprehension

and vocabulary but a relatively low (.14) correlation between reading

comprehension and spatial reasoning. Factor analysis is a way of trying to

make sense of these correlational patterns. The basic idea is to try to arrange

these tests in a multidimensional space such that the distances among the tests

correspond to their correlation. Tests close together will have high correlations

and so measure the same thing. Figure 14.15 shows an attempt to organize the

tests in Table 14.2 into a two-dimensional area. The reader can confirm that

412 | Individual Differences in Cognition

TABLE 14.2

Description of Some of the Tests on the Washington Pre-College Test Battery

Test Name Description

1. Reading comprehension Answer questions about paragraph

2. Vocabulary Choose synonyms for a word

3. Grammar Identify correct and poor usage

4. Quantitative skills Read word problems and decide whether problem

can be solved

5. Mechanical reasoning Examine a diagram and answer questions about it;

requires knowledge of physical and mechanical

principles

6. Spatial reasoning Indicate how two-dimensional figures will appear

if they are folded through a third dimension

7. Mathematics achievement A test of high school algebra

From Hunt (1985).

TABLE 14.3

Intercorrelations Between Results of the Tests Listed in Table 14.2

Test No. 1 2 3 4 5 6 7

1 1.00 .67 .63 .40 .33 .14 .34

2 1.00 .59 .29 .46 .19 .31

3 1.00 .41 .34 .20 .46

4 1.00 .39 .46 .62

5 1.00 .47 .39

6 1.00 .46

7 1.00

From Hunt (1985).

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 412

the closer the tests are in this space, the higher

their correlation in Table 14.3.

An interesting question is how to make sense

of this space. As we go from the bottom to the

top, the tests become increasingly symbolic and

linguistic. We might refer to this dimension as a

linguistic factor. Second, we might argue that, as

we go from the left to the right, the tests become

more computational in character.We might consider

this dimension a reasoning factor. High

correlations can be explained in terms of students

having similar values of these factors.

Thus, there is a high correlation between quantitative

skills and mathematics achievement because

they both have an intermediate degree of

linguistic involvement and require substantial

reasoning. People who have strong reasoning

ability and average or better verbal ability will

tend to do well on these tests.

Factor analysis is basically an effort to go from a set of intercorrelations like

those in Table 14.3 to a small set of factors or dimensions that explain those

intercorrelations. There has been considerable debate about what the underlying

factors are. Perhaps you can see other ways to explain the correlations in

Table 14.3. For instance, you might argue that a linguistic factor links tests 1

through 3, a reasoning factor links tests 4, 5, and 7, and there is a separate spatial

factor for test 6. Indeed, we will see that there have been many proposals

for separate linguistic, reasoning, and spatial factors, although, as shown by the

data in Table 14.3, it is a little difficult to separate the spatial and reasoning

factors.

The difficulty in interpreting such data is manifested in the wide variety of

positions that have been taken about what the underlying factors of human

intelligence are. Spearman (1904) argued that only one general factor underlies

performance across tests. He called his factor g. In contrast, Thurstone (1938)

argued that there are a number of separate factors, including verbal, spatial, and

reasoning. Guilford (1982) proposed no less than 120 distinct intellectual abilities.

Cattell (1963) proposed a distinction between fluid and crystallized intelligence;

crystallized intelligence refers to acquired knowledge, whereas fluid

intelligence refers to the ability to reason or to solve problems in novel domains.

(In Figure 14.11, fluid intelligence, not crystallized intelligence, shows the agerelated

decay.) Horn (1968), elaborating on Cattell’s theory, argued that there is

a spatial intelligence that can be separated from fluid intelligence. Table 14.3 can

be interpreted in terms of the Horn-Cattell theory, where crystallized intelligence

maps into the linguistic factor (tests 1 to 3), fluid intelligence into the reasoning

factor (tests 4, 5, and 7), and spatial intelligence into the spatial factor

(test 6). Fluid intelligence tends to be tapped strongly in mathematical tests, but

it is probably better referred to as a reasoning ability rather than a mathematical

Psychometric Studies of Cognition | 413

1. Reading comprehension

2. Vocabulary

3. Grammar

5. Mechanical

reasoning

4. Quantitative skills

7. Mathematics achievement

6. Spatial reasoning

FIGURE 14.15 A twodimensional

representation

of the tests in Table 14.2.

The distance between points

decreases with increases in the

intercorrelations in Table 14.3.

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 413

ability. It is a bit difficult to separate the fluid and spatial intelligences in factor

analytical studies, but it appears possible (Horn & Stankov, 1982).

Although it is hard to draw any firm conclusions about what the real factors

are, it seems clear that there is some differentiation in human intelligence as measured

by intelligence tests. Probably, the Horn-Cattell theory or the Thurstone

theory offer the best analyses, producing what we will call a verbal factor, a spatial

factor, and a reasoning factor. The rest of this chapter will provide further

evidence for the division of the human intellect into these three abilities. This

conclusion is significant because it indicates that there is some specialization in

achieving human cognitive function.

In a survey of virtually all data sets, Carroll (1993) proposed a theory of

intelligence that combines the Horn-Cattell and Thurstone perspectives. He

proposed what he called a three-strata theory. At the lowest stratum are specific

abilities such as the ability to be a physicist. Such abilities Carroll thinks are

largely not inheritable. At the next stratum are broader abilities such as the

verbal factor (crystallized intelligence), the reasoning factor (fluid intelligence),

and the spatial factor. Finally, Carroll noted that these factors tend to correlate

together to define something like Spearman’s g at the highest stratum.

In the past few decades, there has been considerable interest in the way in

which these measures of individual differences relate to the kinds of theories of

information processing that are found in cognitive psychology. For instance,

how do participants with high spatial abilities differ from those with low spatial

abilities in the processes entailed in the spatial imagery tasks discussed in

Chapter 4? Makers of intelligence tests have tended to ignore such questions

because their major goal is to predict scholastic performance. We will look at

some information-processing studies that try to understand the reasoning factor,

the verbal factor, and the spatial factor.

Factor-analysis methods identify that a reasoning ability, a verbal ability,

and a spatial ability underlie performance on various intelligence tests.

Reasoning Ability

Typical tests used to measure reasoning include mathematical problems, analogy

problems, series extrapolation problems, deductive syllogisms, and problemsolving

tasks. These tasks are the kinds analyzed in great detail in Chapters 8

through 10. In the context of this book, such abilities might better be called

problem-solving abilities.Most of the research in psychometric tests has focused

only on whether a person gets a question right or not. In contrast, informationprocessing

analyses try to examine the steps by which a person decides on an

answer to such a question and the time necessary to perform each step.

The research of Sternberg (1977; Sternberg & Gardner, 1983) is an attempt to

connect the psychometric research tradition with the information-processing

tradition. He analyzed how people process a wide variety of reasoning problems.

Figure 14.16 illustrates one of his analogy problems. Participants were

asked to solve the analogy “A is to B as C is to D1 or D2?” Sternberg analyzed the

process of making such analogies into a number of stages. Two critical stages in

414 | Individual Differences in Cognition

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 414

his analysis are called reasoning and comparison. Reasoning requires finding

each feature that changes between A and B and applying it to C. In Figure 14.16,

A and B differ by a change in costume from spotted to striped. Thus, one predicts

that C will change from spotted to striped to yield D. Comparison requires

comparing the two choices, D1 and D2; D1 and D2 are compared feature by

feature until a feature is found that enables a choice. Thus, a participant may

first check that both D1 and D2 have an umbrella (which they do), then that

both wear a striped suit (which they do), and then that both have a dark hat

(which only D1 has). The dark hat feature will allow the participant to reject D2

and accept D1.

Sternberg was interested in the time that participants needed to make these

judgments. He theorized that they would take a certain amount longer for each

feature in which A differed from B because this feature would have to be

changed to derive D from C. Sternberg and Gardner (1983) estimated a time of

0.28 s for each such feature. This length of time is the reasoning parameter. They

also estimated 0.60 seconds to compare a feature predicted of D with the features

of D1 and D2. This length of time is the comparison parameter. The values

0.28 and 0.60 are just averages; the actual values of these reasoning and comparison

times varied across participants. Sternberg and Gardner looked at the

correlations between the values of these parameters for individual participants

and the psychometric measures of participants’ reasoning abilities. They found

a correlation of .79 between the reasoning parameter and a psychometric measure

of reasoning and a correlation of .75 between the comparison parameter

and the psychometric measure. These correlations mean that participants who

are slow in reasoning or comparison do poorly in psychometric tests of reasoning.

Thus, Sternberg and Gardner were able to show that measures of speed

Psychometric Studies of Cognition | 415

(a)

(c) D1 D2

(b)

FIGURE 14.16 An example of an analogy problem used by Sternberg and Gardner (1983).

(Copyright © 1983 by the APA. Adapted by permission.)

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 415

identified in an information-processing analysis are critical to psychometric

measures of intelligence.

Participants who score high on reasoning ability are able to perform individual

steps of reasoning rapidly.

Verbal Ability

Probably the most robust factor to emerge from intelligence tests is the verbal

factor. There has been considerable interest in determining what processes distinguish

people with strong verbal abilities. Goldberg, Schwartz, and Stewart

(1977) compared people with high verbal ability those with low verbal ability

with respect to the way in which they make various kinds of word judgments.

One kind of word judgment concerned simply whether pairs of words were

identical. Thus, participants would say yes to a pair such as

• bear, bear

Other participants were asked to judge whether pairs of words sounded alike.

Thus, they would say yes to a pair such as • bare, bear

A third group of participants were asked to judge whether pairs of words were

in the same category. Thus, they would say yes to a pair such as • lion, bear

Figure 14.17 shows the difference in time taken to make these three judgments

between participants with high verbal abilities and those with low verbal abilities.

As can be seen, participants with high verbal ability enjoy only a small

advantage on the identity judgments but show much larger advantages on the

sound and meaning matches. This study and others (e.g., Hunt, Davidson, &

Lansman, 1981) have convinced researchers that a major advantage of participants

with high verbal ability is the speed with which they can go from a linguistic

stimulus to information about it—in the study depicted in Figure 14.17

participants were going from the visual word to information about its sound

and meaning. Thus, as in the Sternberg studies in the preceding subsection,

speed of processing is related to intellectual ability.

There is also evidence for a fairly strong relation between working-memory

capacity for linguistic material and verbal ability. Daneman and Carpenter

(1980) developed the following test of individual differences in workingmemory

capacity. Participants would read or hear a number of unrelated sentences

such as • When at last his eyes opened, there was no gleam of triumph, no shade

of anger. • The taxi turned up Michigan Avenue where they had a clear view of the

lake.

After reading or hearing these sentences, participants had to recall the last

word of each sentence. They were tested on groups ranging from two to seven

416 | Individual Differences in Cognition

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 416

such sentences. The largest group of sentences for which they could recall the

last words was defined as the reading span or listening span. College students

had spans from 2 to 5.5 sentences. It turns out that these spans are very strongly

related to their comprehension scores and to tests of verbal ability. These reading

and listening spans are much more strongly related than are measures of

simple digit span. Daneman and Carpenter argued that a larger reading and

listening span indicates the ability to store a larger part of the text during

comprehension.

People of high verbal ability are able to rapidly retrieve meanings of words

and have large working memories for verbal information.

Spatial Ability

Efforts have been made to relate measures of spatial ability to research on

mental rotation, such as that discussed in Chapter 4. Just and Carpenter (1985)

Psychometric Studies of Cognition | 417

High verbal

Identity

700

800

900

1,000

1,100

1,200

1,300

Sound

Type of similarity

Meaning

Response time (m/s)

Low verbal

FIGURE 14.17 Response time of participants having high verbal abilities compared with those

having low verbal abilities in judging the similarity of pairs of words as a function of three types

of similarity. (From Goldberg, Schwartz, & Stewart, 1977.)

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 417

compared participants with low spatial ability and those with high spatial ability

performing the Shepard and Metzler mental rotation tasks (see Figure 4.4).

Figure 14.18 plots the speed with which these two types of participants can rotate

figures of differing angular disparity. As can be seen, participants with low spatial

ability not only performed the task more slowly but were also more affected

by angle of disparity. Thus the rate of mental rotation is lower for participants

with low spatial ability.

Spatial ability has often been set in contrast with verbal ability. Although

some people rate high on both abilities or low on both, interest often focuses

on people who display a relative imbalance of the abilities. MacLeod, Hunt,

and Matthews (1978) found evidence that these different types of people will

solve a cognitive task differently. They looked at performance on the Clark and

Chase sentence-verification task considered in Chapter 13. Recall that, in this

task, participants are presented with sentences such as The plus is above the star

or The star is not above the plus and asked to determine whether the sentence accurately

describes the picture. Typically, participants are slower when there is a

negative such as not in the sentence and when the supposition of the sentences

mismatches the picture.

MacLeod et al. speculated, however, that there were really two groups of

participants—those who took a representation of the sentence and matched it

against a picture and those who first converted the sentence into an image of a

picture and then matched that image against the picture. They speculated that

418 | Individual Differences in Cognition

2,000

4,000

6,000

8,000

10,000

12,000

14,000

16,000

30 60

Angular disparity (degrees)

Low spatial

High spatial

Response time (m/s)

90 120 150 180

FIGURE 14.18 Mean time taken to determine that two objects

have the same three-dimensional shape as a function of the angular

difference in their portrayed orientations. Separate functions are

plotted for participants with high spatial ability and those with low

spatial ability. (From Just & Carpenter, 1985. Copyright © 1985 by the APA. Adapted

by permission.)

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 418

the first group would be high in verbal ability, whereas

the second group would be high in spatial ability. In

fact, they did find two groups of participants. Figure 14.19

shows the judgment times of these two groups as a function

of whether the sentence was true and whether it

contained a negative. As can be seen, one group of participants

showed no effect of whether the sentence contained

a negative, whereas the other group showed a very

substantial effect. The group of participants not showing

the effect of a negative had higher scores on tests of

spatial ability than those of the other group. The group

not showing the effect was the group of participants who

compared an image formed from the sentence against

the picture. Such an image would not have a negative

in it.

Reichle, Carpenter, and Just (2000) performed an fMRI

brain-imaging study of the regions activated in these two

strategies. They explicitly instructed participants to use

either an imagery strategy or a verbal strategy to solve these

problems. The participants instructed to use the imagery

strategy were told:

Carefully read each sentence and form a mental picture of

the objects in the sentence and their arrangement. . . . After

the picture appears, compare the picture to your mental

image. (p. 268)

On the other hand, participants told to use the verbal

strategy were told:

Don’t try to form a mental image of the objects in the sentence, but instead

look at the sentence only long enough to remember it until the picture is

presented. . . . After the picture appears, decide whether or not the sentence

that you are remembering describes the picture. (p. 268)

They found that parietal regions associated with mental imagery tended to be

activated in participants who were told to use the imagery strategy (see Figure

4.1), whereas regions associated with verbal processing tended to be activated

in participants given the verbal strategy (see Figure 11.1). Interestingly,

when told to use the imagery strategy, participants who had lower imagery

ability showed greater activation in their imagery areas. Conversely, when told

to use the verbal strategy, participants with lower verbal ability tended to show

greater activation in their verbal regions. Thus, participants apparently have to

engage in more neural effort when they are required to use their less favored

strategy.

People with high spatial ability can perform elementary spatial operations

quite rapidly and often choose to solve a task spatially rather than verbally.

Psychometric Studies of Cognition | 419

True

affirmative

500

600

700

800

900

1,000

1,100

1,200

1,300

1,400

1,500

1,600

False

affirmative

Sentence difficulty

Mean verification time (m/s)

High-spatial participants

High-verbal participants

False

negative

True

negative

FIGURE 14.19 Mean time taken to judge a sentence as

a function of sentence type for participants with high

verbal ability compared with those with high spatial ability.

(From MacLeod, Hunt, & Mathews, 1978.)

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 419

420 | Individual Differences in Cognition

Conclusions from Psychometric Studies

A major outcome of the research relating psychometric measures to cognitive

tasks is to reinforce the distinction between verbal and spatial ability. A second

conclusion of this research is that differences in an ability (reasoning, linguistic,

or spatial) may result from differences in rates of processing and workingmemory

capacities. A number of researchers (e.g., Salthouse, 1992; Just &

Carpenter, 1992) have argued that the working-memory differences may result

from differences in processing speed, in that people can maintain more information

in working memory when they can process it more rapidly.

As already mentioned, Reichle et al. (2000) suggested that more-able participants

can solve problems with less expenditure of effort. An early study

confirming this general relation was performed by Haier et al. (1988). These

researchers looked at PET recordings taken during an abstract-reasoning task.

They found that the better-performing participants showed less PET activity,

again indicating that poorer-performing participants have to work harder at the

same task. Like the information-processing work pointing to processing speed,

this finding suggests that differences in intelligence may correspond to very

basic processes. There is a tendency to see such results as favoring a nativist

view, but in fact they are neutral to the nature-versus-nurture controversy.

Some people may take longer and may need to expand more effort to solve a

problem, either because they have practiced less or because they have inherently

less efficient neural structures.We saw earlier in the chapter that, with practice,

children could become faster than adults at processes such as mental rotation.

Figure 9.1 illustrated how the activity of the brain decreases as participants

become more practiced and faster at a task.

Individual differences in general factors such as verbal, reasoning, and

spatial abilities appear to correspond to the speed and ease with which

basic cognitive processes are performed.

•Conclusions

This concludes our consideration of human intelligence (this chapter) and

human cognition (this book). A recurring theme throughout the book has been

the diversity of the components of the mind. The first chapter reviewed evidence

for different specializations in the nervous system. The early chapters reviewed

the evidence for different levels of processing as the information entered the system.

The different types of knowledge representation and the distinction between

procedural and declarative knowledge were presented. Then, we considered the

distinct status of language.Many of these distinctions have been reinforced in this

chapter on individual differences. Throughout this book, different brain regions

have been shown to be specialized to perform different functions.

A second dimension of discussion has been rate of processing. Latency data

have been the most frequently used measure of cognitive functioning in this

book. Often, error measures (the second most common dependent measure) were

shown to be merely indications of slow processing.We have seen evidence in this

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 420

Key Terms | 421

1. Chapter 12 discussed data on child language acquisition.

In learning a second language, younger children

initially learn less rapidly, but there is evidence that

they achieve higher levels of mastery. Discusses this

phenomenon from the point of view of this chapter.

Consider in particular Figure 12.8.

2. Most American presidents are between the ages of

50 and 59 when they are first elected as president. The

youngest elected president was Kennedy (43 when he

was first elected) and the oldest was Reagan (69 when

he was first elected). The 2008 presidential election

featured a contest between a 47-year-old Obama and a

72-year-old McCain.What are the implications of this

chapter for an ideal age for an American president?

3. J. E. Hunter and R. F. Hunter (1984) report that

ability measures like IQ are better predictors of job

performance than academic grades.Why might be this

be so? A potentially relevant fact is that the most commonly

used measure of job performance is supervisor

ratings.

4. The chapter reviewed a series of results indicating

that higher ability people tended to perform basic

information processing steps in less time. There is also

a relationship between ability and perceived time it

takes to perform a demanding task (Fink & Neubauer,

2005). Generally, the more difficult an intellectual task

we perform, the more we tend to underestimate how

long it took. Higher ability people tend to have more

realistic estimates of the passage of time (i.e., they

underestimate less).Why might they underestimate

time less? How could this be related to the fact that

they perform the task more rapidly?

Questions for Thought

Key Terms

concrete-operational

stage

conservation

crystallized intelligence

factor analysis

fluid intelligence

formal-operational stage

intelligence quotient (IQ)

preoperational stage

psychometric test

sensory-motor stage

chapter that individuals vary in their rate of processing, and this book has stressed

that this rate can be increased with practice. Interestingly, the neuroscience

evidence tends to associate faster processing with lower metabolic expenditure.

The more efficient mind seems to performits tasks faster and at less cost.

In addition to the quantitative component of speed, individual differences

have a qualitative component. People can differ in where their strengths lie.

They can also differ in their selection of strategies for solving problems.We saw

evidence in Chapter 9 that one dimension of growing expertise is the development

of more-effective strategies.

One might view the human mind as being analogous to a large corporation

that consists of many interacting components. The differences among corporations

are often due to the relative strengths of their components.With practice,

different components tend to become more efficient at doing their tasks. Another

way to achieve improvement is by strategic reorganizations of parts of the

corporation. However, there is more to a successful company than just the sum

of its parts. These pieces have to interact together smoothly to achieve the overall

goals of the organization. Some researchers (e.g., Anderson et al., 2004;

Newell, 1990) have complained about the rather fragmented picture of the

human mind that emerges from current research in cognitive psychology. One

agenda for future research will be to understand how all the pieces fit together

to achieve a human mind.

Anderson7e_Chapter_14.qxd 8/20/09 9:55 AM Page 421

21⁄2-D sketch:Marr’s proposal for a visual representation that identifies

where surfaces are located in space relative to the viewer. (p. 40)

3-D model: Marr’s proposal for an object-centered representation

of a visual scene. (p. 40)

abstraction theory: A theory holding that concepts are represented

as abstract descriptions of their central tendencies. Contrast with

instance theory. (p. 140)

ACT (Adaptive Control of Thought): Anderson’s theory of how

declarative knowledge and procedural knowledge interact in complex

cognitive processes. (p. 156)

action potential: The sudden change in electric potential that

travels down the axon of a neuron. (p. 15)

activation: A state of memory traces that determines both the

speed and the probability of access to a memory trace. (p. 156)

affirmation of the consequent: The logical fallacy that one can

reason from the affirmation of the consequent of a conditional

statement to the affirmation of its antecedent: If A, then B and

B is true together can be thought (falsely) to imply A is true.

(p. 276)

AI: See artificial intelligence.

allocentric representation: A representation of the environment

according to a fixed coordinate system. Contrast with egocentric

representation. (p. 107)

amnesia:Amemory deficit due to brain damage. See also anterograde

amnesia; retrograde amnesia; Korsakoff syndrome. (p. 200)

amodal hypothesis: The proposal that meaning is not represented in

a particular modality. Contrast with multimodal hypothesis. (p. 130)

amodal symbol system: The proposal that information is

represented by symbols that are not associated with a particular

modality. Contrast with perceptual symbol system. (p. 127)

analogy: The process by which a problem solver maps the solution

for one problem into a solution for another problem. (p. 218)

antecedent: The condition of a conditional statement; that is, the A

in If A, then B. (p. 275)

anterior cingulate cortex (ACC):Medial portion of the prefrontal

cortex important in control and dealing with conflict. (p. 89)

anterograde amnesia: Loss of the ability to learn new things after

an injury. Contrast with retrograde amnesia. (pp. 146, 201)

aphasia: An impairment of speech that results from a brain

injury. (p. 22)

apperceptive agnosia: A form of visual agnosia marked by the

inability to recognize simple shapes such as circles and

triangles. (p. 33)

arguments: An element of a propositional representation that

corresponds to a time, place, person, or object. (p. 123)

Glossary

articulatory loop: Baddeley’s proposed system for rehearsing

verbal information. (p. 153)

artificial intelligence (AI): A field of computer science that

attempts to develop programs that will enable machines to display

intelligent behavior. (p. 2)

associative agnosia: A form of visual agnosia marked by the inability

to recognize complex objects such as an anchor, even though

the patient can recognize simple shapes and can copy drawings of

complex objects. (p. 33)

associative spreading: Facilitation in access to information when

closely related items are presented. (p. 159)

associative stage: The second of Fitts’s stages of skill acquisition,

in which the declarative representation of a skill is converted into

a procedural representation. (p. 241)

atmosphere hypothesis: The proposal by Woodworth and Sells

that, when faced with a categorical syllogism, people tend to

accept conclusions having the same quantifiers as those of the

premises. (p. 283)

attention: The allocation of cognitive resources among ongoing

processes. (p. 64)

attenuation theory: Treisman’s theory of attention, which proposes

that we weaken some incoming sensory signals on the basis of their

physical characteristics. (p. 67)

attribute identification: The problem of determining what attributes

are relevant to the formation of a hypothesis. See also rule learning.

(p. 301)

auditory sensory store: A memory system that effectively holds all

the information heard for a brief period of time. Also called echoic

memory. (p. 149)

automaticity: The ability to perform a task with little or no central

cognitive control. (p. 86)

autonomous stage: The third of Fitts’s stages of skill acquisition,

in which the performance of a skill becomes automated. (p. 241)

axon: The part of a neuron that carries information from one

region of the brain to another. (p. 14)

backup avoidance: The tendency in problem solving to avoid

operators that take one back to a state already visited. (p. 221)

backward inference: See bridging inference.

bar detectors: A cell in the visual cortex that responds most to bars

in the visual field. Compare edge detector. (p. 38)

basal ganglia: Subcortical nuclei that play a critical role in the

control of motor movement and cognition. (p. 20)

Bayes’s theorem: A theorem that prescribes how to combine the

prior probability of a hypothesis with the conditional probability

422

Anderson7e_Glossary.qxd 8/22/09 8:52 AM Page 422

Glossary | 423

concrete-operational stage: The third of Piaget’s four stages of

development, during which a child has systematic schemes for

thinking about the physical world. (p. 393)

conditional probability: In the context of Bayes’s theorem, the

probability that a particular piece of evidence will be found if

a hypothesis is true. (p. 301)

conditional statement: An assertion that, if an antecedent