Reading Research Quarterly
16, 1980-81, 261-315. Restored and revised July 2005
Design criteria for process models of reading
ROBERT DE BEAUGRANDE
With summaries in English, French, and Spanish
Since reading is a complex activity, many of whose participating processes are not observable in the laboratory, a
crucial factor in research progress is theoretical model design. Unfortunately,
design principles are often left implicit in discussions of the results obtained
by individual approaches. This paper undertakes to clarify the issues involved
by offering and applying a set of design criteria to characterize ten
alternative models of reading or understanding: Chomsky, the Clarks, Gibson,
Kintsch, Meyer, Frederiksen, Schank, Rumelhart, Woods, and the author’s own.
The final section deals with some future research directions that seem to merit
exploration.
Critères d’étude pour des modèles de procédé de lecture
Étant donné que la lecture est une activité
complexe et que beaucoup de ses procédés de participation ne sont pas
observables en laboratoire, une étude de modèle théorique s’avère être un
facteur crucial dans le progrès de recherche. Malheureusement, les principes
d’études sont souvent implicites dans les discussions de résultats obtenus
par des approches individuelles. Ce compte rendu se propose de clarifier les débouchés
impliquées dans la proposition et l’application d’un ensemble de critères
d’étude pour caractériser dix modèles alternatifs de lecture ou de compréhension:
Chomsky, les Clark, Gibson, Kintsch, Meyer, Frederiksen, Schank, Rumelhart,
Woods, et celui de l’auteur. La dernière section traite des directions de
recherche future qui semblent mériter d’être examinées.
Criterios de diseño para modelos de proceso de lectura
Debido
a que el proceso de lectura es una actividad compleja, muchos de los procesos integrantes no siendo observables en
el laboratorio — un factor crítico para el avance de investigación es el
diseño del modelo teórico. Hay que lamentar el hecho de que las normas de diseño
con frecuencia se dejan implícitas en la discusión de los resultados obtenidos
por intentos individuales. Este ensayo procede a clarificar los problemas
planteados, presentando y aplicando un conjunto de criterios de diseño para
delinear diez alternativas de modelos de lectura o comprensión: Chomsky, los
Clark, Gibson, Kintsch, Meyer, Frederiksen, Schank, Rumelhart, Woods y la propia
del autor. La sección final discute algunas posibilidades de investigación en
el futuro que parecen merecer exploración.
* * *
Although
they had long been considered self-evident, the notions of “theory” and
“observation” in scientific enquiry have recently been opened to dispute. In
the traditional view, a theory is a set of statements among which logical
inference relations hold (Stegmüller, 1976, p. 2). The theory allows us to make
exact predictions about the domain of application and to test them via direct
observation of independently available facts. It has gradually been admitted
that science does not and indeed cannot work in this fashion, as shown by
considerations of three kinds: historical, philosophical, and psychological.
Thomas Kuhn’s (1970) historical survey demonstrates that scientists have not
usually rejected a theory because of falsifying evidence. Wolfgang Stegmüller’s
(1976) philosophical survey marshals the arguments for supposing that all
observation is “laden with theory,” so that “facts” cannot be
independently available to impartial discovery. And new explorations in
cognitive psychology compel us to concede that all forms of knowledge
acquisition, even sensory apperception of the immediate environment, are heavily
dependent upon our internal systems of belief — our “models of reality”
(cf. survey in Beaugrande, 1980a).
Some
alarming implications of this state of affairs have been hotly contested: (a)
that scientific theories are immune to falsification; (b) that scientific
progress is not cumulative, because each new theory makes us see an
entirely new world; and thus (c) that scientists have no rational grounds for
preferring one theory over another, since theories are incommensurable. These
implications cannot be argued away, but their most troubling aspects might be
brought under control. The whole conception of “theory” and
“observation” should be replaced by a conception I call the interaction
of functional diversification with functional consensus (Beaugrande, Note
1). The scientist constructs a theory by designing a complicated conceptual
apparatus to be applied to an envisioned domain of inquiry. The theory is
progressively validated by applying it to steadily more numerous and diversified
domains; the theory is “functional” for any domain when it satisfies our
need for orientation and account to a reasonable degree. “Functional
consensus” can be attained when the theory projects a unifying viewpoint
across all of its domains of application. The problem of circularity between
theory and observation is not eliminated. But every time a new domain is
modelled, the points where a scientist can inject theoretical
pre-dispositions and the means for doing so are structurally
and functionally altered. Hence, purely artefactual findings dictated by
the theory are not usually transferable from one domain to another, so that
theories can be weeded out after failing to combine diversification with
consensus. In the words of William Estes (1979, p. 47), “the measure of
success in moving toward scientific explanation is the degree to which a theory
brings out relationships between otherwise distinct and independent clusters of
phenomena.”
The
acceptance of this approach would impel us to accord a more suitable degree of
consideration to design criteria in scientific theories and models. The
conviction carried by any theory or model no longer rests solely on the thorny
question of direct, objective falsification; instead, scientists can be
convinced by the potential for representing elaborate processes not accessible
to non-theoretical discovery. Scientific progress can thus be cumulative
even though radical shifts of empirical content occur, provided that the
continuity of design manifests a favourable evolution. By the same token,
scientists can transfer allegiance from one theory or model to another in
recognition of its design improvements — even though available evidence is not
conclusive, and the new theory may have fewer successful applications to its
credit. In this fashion, the alarming implications of studies like Kuhn’s may
be resolved.
I
shall accordingly propose a set of design criteria that can serve to construct
comparative profiles of process models for reading research. These criteria were
selected not because they are logically unified in any axiomatic sense,
but because they are operationally compelling for the model designer who
strives toward a functional, diversified theory of reading. Some of the criteria
have been expressly addressed in the literature, as my references suggest.
Others were postulated on the same grounds as any theory (in the sense proposed
above): they are useful in promoting functional consensus by allowing diverse
domains to be characterized in a common descriptive idiom.
1. Criteria for Designing Process Models
Processor Contributions
One decisive aspect is the manner in which the processor (in this case,
the understander reading the text) applies stored knowledge and prior
expectations. In bottom-up models, these contributions are limited to the
analysis of letters, words, phrases, or sentences, while in top-down models,
the processor applies integrative hypotheses about the text, drawing freely from
knowledge and experience of the world to constrain understanding and fill in
materials.
Memory Storage: Abstraction, Construction, or
Reconstruction
If understanding is linked to memory, the role of
previously stored materials and the integration of new ones must be clarified.
Three main approaches have crystallized (cf. Royer, 1977). The abstractive
approach maintains that a processor merely extracts “features” or
“traces” from the presentation and stores them away; recall is done by
reviving these traces of the original experience (cf. Gomulicki, 1956; Neisser,
1967). The constructive approach assumes that the processor begins
integrating stored knowledge with the presentation right away, so that memory
receives an expanded, modified version of the experience and presents this when
recall is required (cf. Bransford, Barclay, & Franks, 1972; Ortony &
Anderson, 1975). In the reconstructive approach, further contributions
are still entering after the experience is stored in memory; recall is thus
based on the current state of storage being assembled by means of a general
organizational pattern (cf. Bartlett, 1932; Spiro, 1977).
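The difference between the three approaches lies in when stored knowledge is allowed to enter. The following is a minimal sketch of that contrast, not an implementation of any cited model. The sample sentence, the two knowledge tables, and all function names are hypothetical, and each "inference" is reduced to a simple table lookup.

```python
# Toy contrast of the three storage approaches. All names and data are
# hypothetical illustrations, not taken from any of the cited studies.

def infer(items, knowledge):
    """Return the items plus whatever the given knowledge adds to them."""
    result = set(items)
    for item in items:
        result |= knowledge.get(item, set())
    return result

def abstractive(presented, at_reading, at_recall):
    stored = set(presented)              # only traces of the presentation
    return stored                        # recall revives the traces unchanged

def constructive(presented, at_reading, at_recall):
    stored = infer(presented, at_reading)  # knowledge integrated while reading
    return stored                          # recall presents the expanded version

def reconstructive(presented, at_reading, at_recall):
    stored = infer(presented, at_reading)
    # Contributions keep entering after storage, so recall is assembled
    # from the current state of knowledge rather than the original one.
    return infer(stored, at_recall)

PRESENTED = ["John dropped the glass"]
AT_READING = {"John dropped the glass": {"the glass broke"}}
AT_RECALL = {"John dropped the glass": {"John was nervous"}}

for model in (abstractive, constructive, reconstructive):
    print(model.__name__, sorted(model(PRESENTED, AT_READING, AT_RECALL)))
```

Under these assumptions, the abstractive reader recalls only the presented sentence, the constructive reader adds the inference made during reading, and the reconstructive reader further adds material that became available only after storage.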
Utilization
Also worthy of investigation is the extent to which a
processor utilizes presented materials. At one extreme, models call for
thorough, complete utilization:
noticing, classifying, and dealing with every element on every language level.
At the other extreme, models view the processor as using the presentation only
occasionally to find cues and confirm predictions.
Automatization
Due to the speed and complexity of understanding, it
is reasonable to suppose that at least some operations are automatic, that is, done
without conflicting with others and without requiring attention (cf. LaBerge & Samuels, 1974). Attentional operations
may remain within the reach of conscious control and thus conflict with others.
A further question is under what conditions control can be imposed upon normally
automatic processes if occasion arises.
Decomposition
Some models show the processor decomposing text
elements (words, meanings, etc.) into configurations of primitives: minimal
building blocks not capable of further reduction. Between the two
extremes (everything is always decomposed versus nothing ever is) there are many
positions holding that some decomposition is done for particular motives.
Processing Depth
Processing depth depends not on readers or on texts,
but rather on the tasks assigned. It has been proven that the amount stored in
memory and recalled varies markedly along this dimension. Words are remembered
better after disambiguating them than after judging their correctness of
spelling (Bobrow & Bower, 1969); finding anomalous meanings aids recall
better than watching for particular sounds or letters (Treisman & Tuxworth,
1974); fitting words into contexts elicits “deeper processing” than finding
rhymes for them (Craik & Tulving, 1975); and making follow-up statements for
a sample is more effective than judging whether the sample is meaningful as it
stands (Mistler-Lachman, 1974).
Scale
Some models address elements only on a local scale,
e.g., recognition of sounds, letters, words, and sentences. Others apply on a global
scale, e.g., the formation of a gist or summary for whole texts. Most
current models are mixed, looking at elements and operations on various scales.
The question then arises of how to correlate the local and global factors.
Power
The term “power” is used by Minsky and Papert
(1974, p. 59) for the capacity to apply general operations and typologies to a
wide range of occurrences. The prime illustration of a high-powered model is the
“general problem solver” developed by the Carnegie-Mellon group of cognitive
scientists (cf. Ernst & Newell, 1969; Newell & Simon, 1972). The prime
illustration of a low-powered approach is the stimulus-response model in which
every activity is elicited by a specific occurrence from the environment.
Modularity versus Interaction
In a modular model there is little
communication among the levels of language (e.g., phonemes, graphemes,
syntax/grammar, semantics, and pragmatics) or among operations that
understanding traverses in real time. In an interactive model, there is
steady interchange among levels and operations, each one helping to guide the
processes in another. Modular models are easy to design and to test with
straightforward experiments, but, in exchange, their operations are manifestly
cumbersome (Winograd, 1975, p. 192). Operations in an isolated phase or level
readily become explosive if they lack ways for eliminating alternatives. For
example, many syntactically possible ambiguities are at once disallowed by the
semantic level in actual communication. Interactive models are difficult to
design, and render experiments complicated and imprecise, but in exchange, they
profit from internal interchange to eliminate many needless dead-end pathways.
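The trade-off can be made concrete with a small sketch. It assumes a toy lexicon of word readings and a hypothetical table of implausible adjacent pairs standing in for the knowledge of a second level. Neither is drawn from any model discussed here. In the modular run, the first level builds every combination of readings before the second level is consulted. In the interactive run, the second level is consulted after each word.

```python
# Toy lexicon and constraint table: both are hypothetical stand-ins.
READINGS = {
    "time": ["noun", "verb"],
    "flies": ["noun", "verb"],
    "like": ["verb", "preposition"],
    "an": ["determiner"],
    "arrow": ["noun"],
}
IMPLAUSIBLE = {("noun", "noun"), ("verb", "noun"), ("verb", "verb")}
WORDS = ["time", "flies", "like", "an", "arrow"]

def plausible_step(seq):
    """Check only the most recently added pair of readings."""
    return len(seq) < 2 or (seq[-2], seq[-1]) not in IMPLAUSIBLE

def plausible_whole(seq):
    """Check every adjacent pair of readings in a finished analysis."""
    return all(pair not in IMPLAUSIBLE for pair in zip(seq, seq[1:]))

def analyse(words, interactive):
    """Return surviving analyses and the number of partial analyses built."""
    partial, built = [()], 0
    for word in words:
        extended = []
        for seq in partial:
            for reading in READINGS[word]:
                built += 1
                candidate = seq + (reading,)
                # Interactive: the other level vetoes dead ends at once.
                if interactive and not plausible_step(candidate):
                    continue
                extended.append(candidate)
        partial = extended
    # Modular: the other level sees only the finished analyses.
    survivors = [seq for seq in partial if plausible_whole(seq)]
    return survivors, built

modular, modular_built = analyse(WORDS, interactive=False)
interacting, interactive_built = analyse(WORDS, interactive=True)
print(modular == interacting, modular_built, interactive_built)
```

Both runs end with the same single analysis, but on this five-word example the modular run builds 30 partial analyses and the interactive run builds 10. The gap widens with sentence length and ambiguity, which is the sense in which isolated levels become explosive.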
Serial versus Parallel Processing
A related (though not identical) distinction must be
made between doing operations one after the other and doing them concurrently. Serial
models have been popular because of their simplicity for making
measurements; later, the serial computer added a suggestive note. More recently,
the need has been increasingly recognized for parallel models of
processing complex tasks performed in limited time. The fact that serial and
parallel models are mathematically equivalent (cf. Townsend, 1974), though
allowing us to postpone a final commitment, does not settle our questions about
real human activities (Kintsch, Note 2, p. 37).
Freedom
Once a basic process model has been set down, how much
freedom does it foresee among different readers under varying conditions?
How can we deal with idiosyncratic or solipsistic readings of particular
individuals? Can operations be re-ordered or modified on occasion, and can they
break down? Such questions are crucial for a consistent treatment of empirical
evidence.
Openness versus Closedness
A closed model is not amenable to
representing any other kinds of operations than those it already encompasses
(e.g., “autonomous syntax” in linguistic theory). An open model can
be expanded as new insights accrue, without requiring a fundamentally new design
(e.g., models based on general cybernetics). The implication often is that open
models are to be construed as merely representative of understanding operations,
rather than as exhaustive statements. Here, the design of the models themselves
is often more revealing than the claims of the researchers involved.
Logical versus Procedural Adequacy
The “adequacy” of a model is its ability to
operate consistently and free of obstacles or break-downs on particular tasks. Logical
adequacy designates the ability to meet the demands of conventional logic
(especially predicate calculus): formal proofs, derivations, quantification,
identity, and the like. (See especially Anderson, 1976, for discussion.) Procedural
adequacy requires that a model be a workable representation of the
operations people might plausibly be performing when they use language (cf.
Schank & Wilensky, 1977, p. 137). These two kinds of adequacy can dictate
quite divergent priorities to a designer.
Learning
A learning model adapts its operations as the
understander progresses through a task. Reading a text appears to be accompanied
by various kinds of learning: refining one’s predictions and expectations,
applying knowledge gained early toward the processing of later sections,
becoming familiar with content and style, and so forth. But this factor
naturally complicates model design still further.
Typology of Materials
While most researchers agree that understanding takes
account of different types of materials (e.g., narrative versus expository
prose), few have expounded conclusive measures for setting up such a typology.
Accordingly, many models are not in fact built to adapt to such differences, or
only to a few obvious ones.
Status of Programming
This criterion simply designates the status of the
model as a functioning computer program, as opposed to being based solely on
observation of humans. At one extreme, researchers want all models stated as
programs (cf. Newell & Simon, 1972; Anderson & Bower, 1973). At the
other extreme, the computer is flatly rejected for emotional, ethical, or
sceptical motives, or because the distinction between computers and minds is
deemed irreconcilable. (See discussion in McCorduck, 1979.)
2. Some illustrative models
In
this section the criteria outlined in Section 1 are applied to a selection
of process models of understanding and reading, each model loosely centred
around its principal investigator(s). The selection is intended to be
illustrative rather than complete, and to draw the attention of reading
researchers toward some whose design criteria raise interesting issues. For
example, the discussion of Chomsky and the Clarks suggests the conclusions that
might be drawn generally about models where understanding is factored in terms
of linguistic analysis (e.g., Holmes & Singer, 1961). The treatment of
Gibson’s work might be emblematic for approaches focusing on the bottom-up
recognition of surface features (e.g., Gough, 1972; Venezky & Massaro,
1979). Rumelhart’s recent inquiries are essentially comparable to many
"story grammars" (e.g., Mandler & Johnson, 1977; Thorndyke, 1977;
but compare Beaugrande, 1983). I have also chosen models from a broad range,
including for instance memory storage, world knowledge, inferencing, etc.,
rather than restrictive accounts centred, say, on word recognition.
The
outcome of these explorations is assembled in Table 1. In cases where I felt
uncertain about some criterion, I was usually able to consult the investigators
themselves.1 Sometimes the criteria were in fact not applicable
(“N/A” in Table 1), because the model in question had never taken or implied
a stand on the matter. Still, this mode of description may prove a useful
complement to surveys that take each model on its own terms and claims (e.g.,
Samuels, 1979).
The Chomsky School
For
a number of years, the "generative transformational grammar" expounded
by Noam Chomsky was construed as a model of human language understanding, and
occasionally even of reading (e.g., Smith, 1971). A "transformational
grammar" is any language model that runs by converting structures into
other structures through formal rules of arrangement. This basically neutral
format was linked by the Chomskyans to a set of far-reaching assumptions about
the human capacities.2
The processor contributions were subsumed in
the standard model (Chomsky, 1965) under the notion of "competence" as
the understander’s “tacit knowledge of the language”. The design of the
model foresaw a central syntactic processing unit whose input or output was
“interpreted” by subsidiary units for sound and meaning. The basic mechanism
of understanding was the conversion of presented materials (“surface
structure”) into basic patterns (“deep structure”). This mechanism called
for total utilization, since every element had to be accounted for, and for
total decomposition into units incapable of further reduction. The use of the
notion of “abstract automata” by the Chomsky school suggested that all
operations were deemed automatic, but the school shunned the task of spelling them out.
The design of the model also required working on a
local scale, with serial operations being carried out by absolutely modular
components. The model was closed against such domains as world-knowledge and
human goals. Logical adequacy was exacted by the fact that the whole model was
inspired by formal logic to begin with (Chomsky, 1964; on some implications of
this bias, cf. Beaugrande, 1981). Procedural adequacy was quite poor (Woods,
1970). Power was low because every operation step had to be expressly done with
corresponding “rules” incapable of generalization across wide ranges of
tasks.
Since Chomsky (1965, pp. 3ff) explicitly excluded
memory as “irrelevant,” no stand on storage was taken. Processing depth was
not an issue either: The concept of “deep structure” was not related to this
notion as expounded in Section 1. No provisions were made for variations in
tasks, and all language users were grouped together as a “homogenous
community.” That view also dispenses with freedom, and to a great degree, with
learning. The processing of any sentence was uniform (handled as a random member
of the “infinite set of sentences” that the model was claimed to describe).
By the same token, no typology of materials was envisioned.
The Chomskyan model has fallen into discredit for
various reasons, though it has retained a group of die-hards who just keep
tinkering with the design to keep empirical issues at bay (Beaugrande, 1998). The
standard version could not be effectively programmed, though some improved
versions were (e.g., Marcus, 1980). It proliferates alternative formattings to
an alarming degree with no routine processing advantages from converting
structures to other structures of the same type (Beaugrande, 1980a, pp. 33ff.).
It discovers many ambiguities no reasonable human would be likely to consider
(Riesbeck & Schank, 1978, p. 251). And, as already noted, it is closed to
many factors that obviously play important roles in human communication.
Herbert and Eve Clark
Like the Chomsky school, the Clarks and their associates at Stanford
also work on the supposition that understanding is in some ways comparable to
the analysis done by professional linguists on single sentences. Their work is
broadly eclectic, conforming to linguistic trends and to the exigencies of
tightly controlled laboratory experiments.
The basic activities of understanding include parsing
a sentence into “constituents” (noun phrase, verb phrase, etc.) and building
underlying “propositions.” These propositions are very close to the surface
sentences, though their status is rather confused. At one point, the reader is
told that “each proposition consists of a verbal unit plus one or more
nouns” (perhaps not a bad working definition of a sentence); later,
propositions are written in bracketed notation with “functions” that “are
expressed as verbs” or other parts of speech; still later the reader is shown
“the proposition ‘Mary bought the book from John’, which is represented as
‘Buy (Mary, book, John)’” (Clark & Clark, 1977, pp. 11, 46, and 114,
respectively). Moreover, “propositions” inherit the “arrangement” of
constituents in the sentence, and are “combined” by means of surface syntax,
e.g., conjunctions and relative clauses (Clark & Clark, 1977, pp. 13ff.). It
is therefore not at all clear whether “proposition” is occasionally just a
vague name for “sentence” and thus a syntactic rather than semantic unit.
All the same, the processor contributions in the
Clarks’ model are not limited to this managing of constituents and
“propositions” in sentences. World-knowledge is admitted in many studies of
how sentence organization is affected when describing visual scenes (Clark,
Carpenter, & Just, 1973; Clark & Chase, 1974). “Inferencing” is
admitted on a modest scale to keep track of the identity of referents (i.e.,
different designations for the same thing) and of relations like
instance-class, part-whole, and cause-effect (Clark, 1977). To keep inferencing
in bounds, only those inferences are considered that the text producer has
presumably “authorized” (Clark, 1978) — a standard explained by examples,
not by a clear definition.
Herbert Clark (in a personal communication) describes
his outlook on memory storage as agreeing with construction and reconstruction.
But he has combated the “constructivist view” as “too inclusive” (Clark,
1978, p. 295). And I cannot see how provisions could be made for reconstruction
on any scale. Instead, he seems to disallow the view when he states that
“comprehension is conceived to be the process by which people arrive at the
interpretation the speaker intended them to grasp for that utterance in that
context” (Clark, 1978, p. 295, appealing to the philosophers H. Paul Grice,
Stephen Schiffer, and Jonathan Bennett, none of whom relies on empirical
evidence from real “speakers”). At most, a reader could “reconstruct”
the author’s “intentions,” which is not the spirit in which the notion of
“reconstruction” was introduced. Due to the strong concern for linguistic
analysis, utilization is heavy in this model, being done serially on a local
scale. Power is low, since, as we saw, sentences and propositions are so closely
intertwined; the same factor of course encourages interaction of syntax and
semantics. A strong stand was taken on decomposition (E. Clark, 1973), now
softened considerably (Clark & Clark, 1977).
The categories of automatization, freedom, and logical
and procedural adequacy have apparently not been addressed at all, and
processing depth only in passing (Clark & Clark, 1977, pp. 151ff.). Learning
might be possible when understanders adapt to distributions of “given” and
“new information” (Clark & Haviland, 1977). No typology of materials is
offered beyond the sentence. And none of the model has been programmed.
The Clarks’ model is comparatively versatile within
the limitations of conventional linguistics and experimental psychology. They
characteristically prove their assumptions by trying to show that some
understanding operation (building internal representations of a sentence,
resolving ambiguities, making inferences, etc.; cf. Clark & Clark, 1977)
takes time. Accordingly, the model has been steadily accruing in an exemplary
bottom-up fashion that leaves little room for doubting that such operations
exist. The question is rather whether those operations are sufficient to capture
understanding.
Eleanor J. Gibson
Eleanor Gibson’s approach to reading has been
decisively influenced by the work of James J. Gibson, who flatly denies that
human processors contribute anything to what they understand (J. Gibson, 1966).
The only operations allowed are those of detecting, extracting, and utilizing
the “features” within the presented materials. A model of “word
perception” was accordingly set forth as a series of independent phases of
feature extraction in this order: phonemic/graphemic, then syntactic, and
finally, semantic (E. Gibson, 1971). Later, an elaborate model of reading was
proposed along comparable lines (E. Gibson & Levin, 1975).
The abrogation of processor contributions has
important consequences for the model’s design. Utilization is heavy
(especially for skilled readers) and, as we saw, modular. Decomposition is
advanced to a status of marked prominence,3 though it is conceded
that words are not broken down into individual letters (Gibson & Levin,
1977, p. 164), which would be the extreme view espoused on occasion by Philip Gough (1972).
Scale is naturally local, and power quite low. Significantly, most of the
Gibson-Levin volume is devoted to sounds, letters, spelling, grammar, and
syntax. Meaning figures as little more than “adaptive significance” (Gibson
& Levin, 1975, p. 20) that is “learned as part of a situational context in
which it is invariant with an event of interest” (E. Gibson, 1977, p. 168).
Provisional conclusions might be drawn about criteria
that I did not find treated. Automatization figures only in adult readers
(Gibson & Levin, 1975, p. 9), though the feature-extracting operations have
an automatic quality in general. Processing depth might be accounted for as a
truncation of processing before the semantic stage causing a detriment to
recall. Memory storage would be abstractive. (The Gibson-Levin volume has little
to say about memory, despite frequent recourse to “learning” and
“storing” features.) The illustrations suggest a strong leaning toward
serial processing, both in the order of phases and the stressing of serial media
(letters, word components, phrases, etc.). Learning is accounted for as skill
acquisition through experience and transfer (Gibson & Levin, 1975, pp.
63ff.). A typology of materials could be implied as a grouping of feature
repertories.
Other criteria seem still more doubtful. Logical and
procedural adequacy are both equally remote. None of the model is programmed,
and the computer is viewed with mistrust (Gibson, 1977).
Freedom is accounted for only in the different
developmental stages of children. The model is open toward further operations of
feature extraction, but emphatically closed toward any information-processing
operations (Gibson, 1977).
The Gibsonian denial of processor contributions is
undoubtedly a minority view at present, regarded with discomfort even by
cordially disposed professional colleagues (e.g., Neisser, 1976). In emphasizing
the role of contrastive features and differentiation, the Gibsonian model is
left with few methods for depicting the connection, unification, and integration
that readers must perform; a simple assembly of features into steadily larger
units would most likely be explosive owing to a lack of hypotheses about what
kind of units should be built in a given case.
Walter Kintsch
We turn now from predominantly bottom-up models to
consider some models more favourably inclined toward top-down operations. Among
the best-known is that developed by Walter Kintsch and co-workers at the
University of Colorado. His approach reflects his extensive expertise on
theories of learning, coding, and memory. (See Kintsch, 1977a.) By working with
texts rather than sentences, he has gradually moved away from the classical
procedures of experimental psychology and has applied elaborate techniques of
measurement (described in Miller & Kintsch, 1980).
For a long time, Kintsch’s model did not begin with
the presented text at all. A hand-coded proposition list was prepared as the
basis for study. (Techniques are described in Turner & Greene, 1977.)
Recently, however, he has depicted the treatment of the text in terms of
segmenting into “chunks” according to the number of underlying
“propositions,” i.e., a predicate plus one or more arguments (Miller &
Kintsch, 1980). Further processing is done by a “coherence graph generator”
that connects propositions into a network, a “fact organizer” that matches
up input to stored “world knowledge,” an “inferencer” that “fills in
missing propositions,” and a “control structure” guiding the
“macro-operators” that produce the “gist” of the text (Kintsch, Note 2,
p. 13). In particular, “schemas” are applied to give an abstract outline of
a story or event sequence (cf. Kintsch, 1977a, pp. 373ff.; 1977b; Kintsch & van
Dijk, 1978).
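The notions of a proposition and of a network connecting propositions can be illustrated with a minimal sketch. It is not Kintsch's program. The class and function names and the hand-coded sample list are hypothetical, and the linking criterion (two propositions are connected when they share at least one argument) is an assumption of this sketch rather than something the description above spells out. Propositions that take other propositions as arguments are not handled.

```python
from dataclasses import dataclass
from itertools import combinations

@dataclass(frozen=True)
class Proposition:
    """A predicate plus one or more arguments."""
    predicate: str
    arguments: tuple

def coherence_graph(propositions):
    """Link every pair of propositions that shares at least one argument.

    Returns a set of (i, j) index pairs with i < j.
    """
    edges = set()
    for (i, p), (j, q) in combinations(enumerate(propositions), 2):
        if set(p.arguments) & set(q.arguments):
            edges.add((i, j))
    return edges

# Hand-coded proposition list, as in the early hand-coding procedure.
PROPS = [
    Proposition("BUY", ("Mary", "book", "John")),
    Proposition("OLD", ("book",)),
    Proposition("HAPPY", ("Mary",)),
    Proposition("SLEEP", ("dog",)),
]

print(sorted(coherence_graph(PROPS)))
```

With this sample, the first proposition is linked to the second through "book" and to the third through "Mary", while the fourth remains unconnected. An isolated proposition of this kind is where an inferencer would have to supply a missing link.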
We can see that utilization is heavy, and that memory
storage will be likely to be both constructive and reconstructive. Propositions
are studied both on local and global scales, e.g., when one proposition takes
others as its arguments, leading to a hierarchical network (cf. Kintsch &
Vipond, 1979). Power is high as a result of the general operation types, such as
“deleting” (removing), “generalizing” (subsuming), and
“constructing” (restating) of materials during the formation of a steadily
more global representation (cf. Kintsch, 1977b; Kintsch & van Dijk, 1978).
The model runs in parallel, with emphasis on procedural adequacy. Learning is
manifested in the cumulative operations that lead to the gist.
Rather surprisingly, Kintsch (Note 2, p. 46) describes
his own model as modular. This designation is based on the need to test only one
part of it in a given experiment. In the domain of theory, the model is
decisively integrative: How, for instance, could human readers separate their
knowledge of “facts” from their “inferencing”? Modularity is also needed
because only some parts of the model have been programmed: the “coherence
graph generator” (running) and the “fact organizer” (in progress);
no parser or inferencer is available.
Kintsch agrees that factors such as automatization,
processing depth, freedom, and a typology of materials are vital considerations,
but he has not yet tried to account for them in his own work. Regarding
decomposition, he argues that it can be done if required, but is not carried out
routinely (Kintsch, 1974). He consistently notes that his model is open. It
would not be hard to add on a parser, for example.
The appeal of Kintsch’s model lies especially in its
unusual breadth that correlates functional diversification with functional
consensus (Anderson, 1976, p. 55). His handling of semantics (an important
weak point in the models examined previously) is elegant, powerful, and
humanly reasonable. Unlike the Clarks, Kintsch is willing to disregard effects
of surface syntax to some degree, and classifies sentences only on the basis of
their number of underlying propositions. I do not agree that his experiments
“are not tests of strict deductions of the theory,” but only “studies in
search of a theory” (Kintsch, 1974, p. 243). As argued at the outset of this
paper, the notion of “theory” should be modified. It might be argued that
different validation procedures are required in this kind of research; clinging
to the old paradigm of statistical predictions followed by direct observation
would make us discard some of our best data and insights.
Bonnie June Francis Meyer
The work of Bonnie Meyer and associates at Arizona
State University takes its distinctive cast from the positions adopted in regard
to processor contributions, scale, and power. She postulates that each text
possesses an inherent “structure of content” yielding an ordered hierarchy
of importance. Reading operates most efficiently if this hierarchy is discovered
and utilized. She began by constructing intuitive hierarchies for sample science
texts and asking independent judges to rate the importance of the constituent
propositions; there was fairly high agreement among the judges about relative
importance, and the higher-ranking propositions by and large were best and most
often recalled by diverse groups of readers (Meyer & McConkie, 1973).
The act of reading is envisioned along the lines of
Joseph Grimes’s (1975) discourse analysis, using a “case grammar” of verbs
and verb adjuncts to handle the surface text. The words of the text (at least
the verbs and their adjuncts) are taken as “lexical predicates” whose
“arguments” are “ideas from the content of the text”; these
predicate-argument combinations are “lexical propositions” that in turn
become arguments of “rhetorical propositions” (cf. Meyer, 1975, pp. 26ff.;
Meyer, 1977, p. 317). By making those “rhetorical propositions” into
arguments of more and more general propositions, a hierarchy of importance is
eventually obtained.
In consequence of having changed from an abstractive
outlook (e.g., in Meyer & McConkie, 1973) to a constructive or
reconstructive one (Meyer, personal communication), she is evasive about whether
the “top-level structures” (which she equates with “schemas” in the
sense of Richard Anderson, 1977)5 are situated in the text or in the reader.
Some processor contributions of readers are doubtless hidden away in the
“importance judgements” elicited regarding passage structure. Meyer admits
that a text may follow the “top-level structure” in a “normal” or
“distorted” way, as has in fact been shown by Perry Thorndyke (1977), so that
readers must be able to make such distinctions. For example, having the main
ideas at the beginning of a text is considered “normal” (Meyer &
Freedle, 1979, p. 3). Meyer’s solution is to appeal to the schema used by the
author: Readers who fail to use the same schema (for instance, because they
reject the author’s arguments, or are simply unskilled) will not perform as
well (Meyer, 1980). Meyer’s model is apparently most valid for readers who
follow the author’s guidance. Her group surmounts the technical difficulties
(knowing what the author intended) by authoring their samples themselves around
various “top-level structures.”
In agreement with this overall framework, utilization
is viewed as medium heavy, and scale and power are mixed as needed to build a
hierarchy. Interaction is possible at least between the “lexical” and
“rhetorical” components, and parallel processing is implied (though not
clearly demonstrated). Decomposition is not postulated.
Meyer’s undecided stance regarding the relationship
of texts and readers makes some of the model criteria difficult to apply:
Automatization, freedom, and adequacy have either not been addressed or are only
now coming under investigation. In exchange, she places great emphasis on a
typology of materials designed to reflect “top-level structures,” e.g.,
“adversative” (comparing a favoured view to an opposing one),
“covariance” (relating preconditions to their outcomes), “response”
(stating a problem and offering a solution), “attribution” (depicting the
traits of an object or event), and so forth. She suggests on occasion that these
different types may elicit variations in processing depth (Meyer & Freedle,
1979, p. 19). Learning is also acknowledged to be a factor of text organization:
she indicates that “recall is facilitated when passages [of a text read
progressively] have the same structure but different content, while it is
inhibited if passages have different structures but the same content” (Meyer,
1977, p. 310). She suggests that traditional learning experiments on “serial
position,” “primacy,” and “recency,” or “proactive and retroactive
inhibition” (in effect, whether recall depends on the order of
presentation) were inconclusive because of the failure to consider hierarchical
structure of content (Meyer, 1977, pp. 308ff.).
Meyer’s model is the last one considered here which
has not been programmed at all. In her case, as in that of the Clarks and
Gibson, the possibility exists that creating a program would force a
clarification of issues now left indistinct (cf. pp. 46ff.). Still, Meyer’s
model is open to admitting more reader activities, and empirical results are
impressive and intriguing. Her exploration of global organization was clearly a
pioneering effort at a time when few other researchers had realized the
importance of this factor.
Carl Frederiksen
Like Meyer, Carl Frederiksen (now at McGill
University) and his associates are concerned with the evolution of propositions
from a presented text. But Frederiksen’s model differs from Meyer’s in
significant ways, notably in the development of more detailed typologies of
materials. He defines “propositions” as “semantic relations” that either
“identify objects or actions” with respect to space, time, attributes,
parts, and the like, or “specify the elements of an event”: the “case
roles” such as agent, object, patient, or the like (Frederiksen, Frederiksen,
Humphrey, & Ottesen, Note 3, p. 4). He sees propositions as moving upward in
a scheme of “ranks,” including “event frames” (whatever participates in
an event), “relative systems” (propositions connected by a function), and
“dependency systems” (propositions connected by logical, causal, or
conditional relationships) (Frederiksen, 1977, p. 65).
The act of reading is represented as follows: The
reader approaches a text segment (usually, but not obligatorily, a sentence),
finds the propositions, and assigns them to classes. There follows a “first
stage” of “inferences” that resolves questions of reference (e.g.,
pronouns, substitutions) and ambiguity. Then more elaborate inferences are made
in three basic categories: “Connective” inferences “relate a current
proposition to propositions derived from the prior text”; “extensive”
inferences “extend current propositions by generating new propositions based
on prior knowledge and discourse context”; and “structural” inferences
“segment and organize a text, building a coherent model
of the text as a whole” (Frederiksen et al., Note 3, pp. 5ff.).
Utilization of the presented text materials is
variable according to the presence of “discourse features that require or
elicit inferences,” such as conjunctions, deictics, pronouns, and so forth.
It is obvious that processor contributions are heavy,
since most of the organization that Meyer sees as inherent to a text is done
here by the reader’s inferencing operations. This approach lends
Frederiksen’s model an unusual flexibility in regard to bottom-up or top-down
processing. Indeed, the kind of “schema” or “top-level structure” that
applies here constitutes itself directly from the interaction of text features
and reader operations. Such a view is opposed to reading as a matching of static
patterns with fixed slot configurations. Memory storage is naturally
constructive and reconstructive.
Scale is mixed according to the scheme of “ranks”
(a notion of Michael Halliday). Considerable power resides in the notion of a
“system” as a functioning repertory of options, such that any structure is
viewed as a series of decisions (Frederiksen, 1977, p. 59). Frederiksen also
relies on high-powered typologies capable of subsuming the details of semantic
units and relations on all levels. And he now recognizes a still higher-powered
reader awareness of one’s own procedures, allowing for modification and
adaptation during operation.
The model’s heavily interactive design renders it
open to treating far more issues than it now encompasses. Procedural adequacy
dominates, but does not exclude logical adequacy.6 Learning is called for
constantly when readers consult previous propositions in understanding and
organizing subsequent ones (Frederiksen, 1977, p. 70). Freedom is treated by
comparing recalls of children at different ages in order to measure prior
knowledge and reading practice (Frederiksen et al., Note 3).
To avoid committing his model to a fixed set-up of
compulsory procedures, Frederiksen (in a personal communication) takes no stand
on automatization, processing depth, or serial vs. parallel processing. He would
admit decomposition as a possibility, but his model does not use it.
The programmed version of the model is far from
completed. It segments the text, but queries a human user about presence and
classification of propositions for each segment. A “text file” is built from
these propositions, and an “inferential review” notices “common
elements” and creates a coherent network. Some samples are reprinted in
Frederiksen (1977).
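As a rough illustration only, and not a reconstruction of Frederiksen's program, the “inferential review” can be pictured as a pass over the “text file” that links any two propositions sharing an element. The record layout, the function name `inferential_review`, and the sample propositions are assumptions made for this sketch.

```python
from itertools import combinations


def inferential_review(text_file):
    """Link every pair of propositions that share at least one element.

    Each proposition is a dict with a hypothetical layout:
    {"segment": int, "class": str, "elements": list of str}.
    Returns (index_a, index_b, shared_elements) triples.
    """
    links = []
    for (i, first), (j, second) in combinations(enumerate(text_file), 2):
        common = set(first["elements"]) & set(second["elements"])
        if common:
            links.append((i, j, sorted(common)))
    return links


# Propositions as a human user might have entered them (invented sample).
text_file = [
    {"segment": 1, "class": "event", "elements": ["rocket", "stand", "desert"]},
    {"segment": 2, "class": "attribute", "elements": ["rocket", "weight"]},
    {"segment": 3, "class": "event", "elements": ["scientists", "withdraw"]},
]

network = inferential_review(text_file)
```

In this sample only the first two propositions become connected, through the shared element "rocket"; the third stays isolated until an inference or a later proposition supplies a common element.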
Like Kintsch and Meyer, Frederiksen is actively
engaged in reading education. He uses “natural experiments” in which
children’s reading is studied in “a wide variety of naturally occurring
discourse situations” (Frederiksen et al., Note 3, p. 10). His samples are
actual school texts children normally read. Such research is an important
counterbalance to the restrictive, artificial laboratory tasks with isolated
sentences.
Roger Schank
Unlike the models discussed hitherto, the approach of
Roger Schank and co-workers (now at Yale) was devoted, from its earliest
inception, to building functioning computer understanders. Openly repudiating
the low-powered, bottom-up models of conventional linguistics and psychology,
they designed understanders that reason extensively about the organization of
actions and events in order to read stories and answer questions about or
summarize what was read. They also rejected the view that memory is
“semantic” — that is, a neat dictionary of classifications — and argued
in favour of “episodic memory” for systemizing one’s own personal
experiences from day to day (Schank, 1975a).
The first large-scale program was called MARGIE
(Memory Analysis and Response Generation in English) (Schank, Goldman, Rieger,
& Riesbeck, 1975). The understander would start reading words, recovering
underlying concepts, drawing inferences, and making predictions (“requests”)
about what should be expected — all this before the end of a sentence was
reached (Riesbeck, 1975). The mainstay was the stored knowledge about what
constitutes actions and events: how they update the state of the world, and what
participants (“conceptual cases” such as “objective, directive, recipient,
or instrument”) should be anticipated and used to fill slots (cf. Schank,
1975b, pp. 30ff.). Rich inferencing was done right away about causes, effects,
intentions, and motivations (Rieger, 1975).
In more recent programs such as SAM (Script Applier
Mechanism), the processor contributions were upgraded in scale and power to the
applying of “scripts” as global sequences of routine actions commonly done
in human affairs (e.g., a visit to a restaurant) (Schank & Abelson, 1977;
Cullingford, 1978). In PAM (Plan Applier Mechanism), another technique was to
reason about the goals of story characters in general (Wilensky, 1980). These
approaches lent more direction and purpose to the activities of understanding,
and reduced the volume of local inferencing to be done.
Schank’s approach is so heavily constructive that
the utilization of the actual text is quite light. Riesbeck’s (1975) MARGIE
parser did no syntactic analysis except what was useable right away for
conceptual understanding. In a new version called IPP (Integrated Partial
Parser), the parser reads only some words, skips others and saves them, or else
skips them altogether (Schank, Lebowitz, & Birnbaum, 1978). Words are judged
“interesting” if they generate expectations [...] troublesome issue for models
like Gibson (1971) and Gough (1972).
The recognized words are fed into a “parser”
containing knowledge about “case frames,” i.e., patterns of units associated
with a word (usually a verb) (Rumelhart & Levin, 1975, pp. 171ff.). The
parser builds a format called an “augmented transition network” (cf. Woods,
1970; Kaplan, 1975) (shown later in Figures 1 through 7). This network has
states (nodes) representing words, and arcs (links) representing syntactic
relationships. The parser invokes the “interpreter” that gradually assembles
a semantic network composed of basic primitives like “do,” “cause,”
“change,” and so on (cf. Rumelhart & Levin, 1975). Each event or
action has a “schema” of these primitive components to organize incoming
knowledge (Norman & Rumelhart, 1975, pp. 406ff.; Rumelhart, 1977b, p. 266).
Like Schank’s group, Rumelhart and his associates
felt a need for some higher-powered, more global way of guiding understanding.
The notion of “schema” was stipulated to represent the general line of a
story with “settings” (statement of time, place, and characters of the
story) and “episodes” (event plus the protagonist’s reaction) (Rumelhart,
1975). Soon, this definition was expanded to incorporate problem-solving
procedures (Rumelhart, 1977b; Rumelhart & Ortony, 1977). Now, the story line
is driven by the protagonist’s “trying” to attain a “goal” set up in
response to some “event.” The story outline is thus a hierarchy of episodes
(event + response of setting up goal + try for the goal + outcome) that cause or
contain each other. It was shown that readers making summaries used the
materials corresponding to the upper levels of this hierarchy (Rumelhart, 1977b),
a finding Meyer obtained from her own standpoint. (See above.)
The processor contributions in Rumelhart’s model are
therefore stored patterns often termed “schemas.”8 This approach
leads to a constructive and reconstructive view of memory storage (Norman &
Rumelhart, 1975a, p. 23). Such factors as scale and serial vs. parallel
processing hinge upon the size of schemas and their order of application (on
parallel use of multi-scale independent “knowledge sources” via a “message
center,” cf. Rumelhart, 1977c). Schemas may work for both global and local
input, and be applied concurrently, with or without knowledge of each other. The
strong stand on decomposition taken in the 1975 volume has been attenuated:
Decomposition would be done only if the components of a schema being matched
were smaller than the input units. Much of the schema-matching is believed to be
done automatically.9
As was remarked in connection with Schank’s work (p.
280), processing depth is humanly plausible, but hard to capture in a processing
model. Rumelhart and Levin (1975, pp. 203ff.) suggest a “variable depth
strategy” which “allows information to be stored at different levels in the
decomposition process.” They envision “partial comprehension” that leaves
ambiguities unresolved in hopes of finding out more later on. Another approach
would be to allow for differences in utilization according to the reader’s
motivation to account for the input with more or less completeness and
thoroughness: Here, the matching of schemas would simply be less demanding.
Freedom could easily enter on a local scale when different readers are all
working for only a superficial or general account; on the global scale, however,
he found that readers’ hypotheses tend to converge as soon as a story becomes
reasonably unambiguous (Rumelhart, Note 4).
The appeal to abstract pattern-matching operations in
all areas of understanding lends Rumelhart’s model substantial power: To some
degree, all processes can be viewed as essentially the same. In such a
framework, interaction is easy to introduce (Rumelhart & Norman, 1975b, p.
159; cf. Rumelhart, 1977c); and the model is open to many extensions. Once
attained in one domain, procedural adequacy can readily be generalized to
others.
Gibson and Levin (1975, p. 401) misunderstand the
notion of “schema” as a denial that readers can “extract any new
information from what we read.” Rumelhart has pointed out that learning is
always possible: The abstract outlines of schemas can be filled or combined in
totally unique new ways. Moreover, schemas can evolve as they are “tuned”
to new experiences (e.g., becoming more general or specific, or refining their
range of expected variables); new schemas can be induced from experience or
patterned after old ones, and even old schemas do not preclude memory traces of
particular occurrences. (Discussion is given in Rumelhart, 1980.)
As of 1980, Rumelhart’s model was far from
running as a complete computer program. For one thing, the model lacks access to
sufficient machine memory. The surface text is hand-parsed into a case-like
representation. And only a few programs have been written to apply story schemas
to simple examples.
Rumelhart’s approach, like that of the groups around
Kintsch and Schank, is commendably broad. But unlike those groups, his concerns
extend to the most basic mechanisms of recognition and storage.10
Reading is therefore seen to be situated within the domain of human information
processing at large. It is greatly to be hoped he will eventually provide an
integrated, start-to-finish version of his model.
William Woods
Though designed to run entirely on the computer, the understanding
models developed at Bolt, Beranek, and Newman (not by accident in Cambridge, MA)
under the supervision of William Woods follow quite a different line of
reasoning from those in Schank’s group. Woods (1970) was among the first to
recognize that the augmented transition network (see Figures 1-7 below) was
superior to the dominant transformational model for parsing English syntax.
Lately, he has been working to generalize this format for all the subsystems of
language understanding in parallel: “one for acoustic phonetic recognition,
one for lexical retrieval (word recognition), one for syntax, one for
semantics, and one for subsequent discourse tracking” (Woods, 1978b, p. 35).
Each subsystem should be able to consult the findings of the others about
ongoing structures, and thus to drastically reduce the number of alternatives to
be considered. In fact, operations that would yield structurally comparable
configurations in different subsystems (or along different paths of the same
system) are merged on the spot and thus done only once, a scheme Woods (1978b)
calls “cascading networks.”
According to the most recent version I could obtain
(Brachman, 1979a), the reading process would run as follows. A user types
English sentences into the computer to be syntactically analysed with a parser
designed by Rusty Bobrow (1978). This parser works through the premodifiers in a
noun phrase or verb phrase until the head is found. The results are sent to the
semantic stage in the reverse order (head first). Then any post-modifiers are
picked up until the phrase is completed. Finally, specifications are sent along:
determiner and number for a noun; or tense, aspect, and (if present) negation
for a verb. All the word classes are identified with the help of a syntactic
taxonomy (list of types). By consulting the semantic stage early during
syntactic processing, the understander avoids setting up syntactically
acceptable, but semantically anomalous
readings.
The semantic stage applies concepts with an intricate
internal organization designed by Ronald Brachman (1978). Its salient traits
include specifying the “roles” of a concept, e.g., parts of an object or
participants in an event (a kindred approach to Rumelhart’s “cases”
and Schank’s “conceptual dependencies”), and an elaborate capacity for
“inheritance” of knowledge between a class and an instance (e.g., human
beings/Napoleon) or between a superclass and a subclass (e.g., animals/human beings)
(cf. also Fahlman, 1979). These traits lend the semantics a remarkable power
combined with a compact, easily searchable memory store.11
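The economy gained through “inheritance” can be illustrated with a small sketch. It is not Brachman's representation language: the `Concept` class, the `lookup` method, and the role fillers are hypothetical and serve only to show how knowledge stored once on a superclass or class becomes available to every subclass and instance.

```python
class Concept:
    """A concept with locally stored roles and an optional parent.

    The parent is the class of an instance or the superclass of a
    subclass. All names here are assumptions made for illustration.
    """

    def __init__(self, name, parent=None, **roles):
        self.name = name
        self.parent = parent
        self.roles = dict(roles)

    def lookup(self, role):
        """Return the filler of a role, searching upward through parents."""
        node = self
        while node is not None:
            if role in node.roles:
                return node.roles[role]
            node = node.parent
        raise KeyError(role)


# Superclass / subclass / instance, following the examples in the text.
animal = Concept("animal", can_move=True)
human = Concept("human being", parent=animal, legs=2)
napoleon = Concept("Napoleon", parent=human, occupation="emperor")

print(napoleon.lookup("occupation"))  # stored on the instance itself
print(napoleon.lookup("legs"))        # inherited from the class
print(napoleon.lookup("can_move"))    # inherited from the superclass
```

Because each fact is stored at the most general node to which it applies, the memory store stays compact, and a search has only to climb one chain of parents.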
The outcome of the semantic stage is a “literal”
interpretation of the meaning of sentences in a structured network. The
“discourse expert” then takes care of pronouns (anaphora), quantification,
and speech-act recognition. This last operation (worked out for computers
especially by James Allen, 1979) is done by hypothesizing a plan that the user
should be following — or even by “inferring what the user wants the system
to think is his or her plan” (Brachman, 1979a, p. 20).
Processor contributions in this model are not just
prior knowledge: More important are skilled techniques for organizing operations
on the spot. Consequently, the model attains unrivalled power, interaction, and
learning capabilities. The grammar of procedures continually refines itself the
longer it is used, and the specific tasks at any moment refine themselves still
further (Woods, 1978, p. 22). Thus, although the model is open, it may not
require any more major theoretical additions. Woods (1978, pp. 63-70) shows how
it can be generalized beyond the domain of language understanding to encompass
such areas as visual scene analysis (e.g., light intensity or hue) and acoustic
frequency recognition. Nothing is needed besides ways to “structure” and
“index” a “perceptual domain” with “operations” that “probe
transitions” to identify “states” and “constituents” (especially
“initial” and “final” states in a unit), and to keep records of what is
hypothesized and found.
In such a high-powered approach, handling scale and procedural adequacy posed no separate
difficulties. Utilization is somewhat heavy, but never wasteful. Decomposition
is much less important than assembly. Parallel operation is built in with
unmatched elegance. In return, the design implies no decisions regarding
automatization, processing depth, freedom, or typology of materials. Plainly,
such an elaborate program must, for the present anyway, rely on uniform
treatment of readers and their memories. Only a restricted amount of the English
language can be dealt with so far.
Unfortunately, Woods’ understander is known chiefly
to specialists, and has influenced just a few reading models (e.g.,
Rumelhart’s and mine). Its merits emphatically recommend it to a wider
audience, particularly its extraordinary power and economy. Such a model is
manifestly able to encompass a far wider domain than it has been managing to
date, and suggests how a host of issues might be handled that have scarcely yet
been recognized by many researchers.
Robert de Beaugrande
My own search for a process model began in 1974, when
I undertook an account of the operations involved in translating poetry.
Although an outline model was eventually propounded (Beaugrande, 1978a) in terms
of reader aesthetics, many of the issues required a more detailed and general
theory of text production and comprehension, stated within the framework of
cognitive science rather than old-style linguistics.
I have now assembled far more complex outline models
for texts and text processing (Beaugrande 1980a, 1984, 1997). Unusual emphasis
has been consistently devoted to issues of integrative design: Reading is seen
as an interaction of phases of processing dominance, i.e., as a correlation of
processing types sharing the processor’s cognitive resources in varying
distributions. The phases are: parsing (identifying the grammatical dependencies
of the surface text), concept recovery (associating language expressions with
cognitive content), idea recovery (building the central conceptual configuration
that organises content) and plan recovery (identifying the plans and goals that
the text is intended to pursue) (cf. also Beaugrande & Dressler, 1980, ch.
3). Any dominant phase freely consults the results of non-dominant ones, so that
grammar is continually correlated with meaning, meaning with action planning,
and so on. I do not see these phases receiving dominance in a neatly fixed
sequence; instead, dominance is probably passed back and forth frequently, as
particular aspects enter the focus of processing, e.g., because results seem
unsatisfying.
To support functional diversification and consensus, I
have argued that the main activity is connectivity search: Given any textual
occurrence (be it a letter, word, clause, concept, discourse action, or
whatever), how does it relate to other occurrences in that subsystem or in other
subsystems? The duration, intensity, and thoroughness of this search are always
dependent on the reader’s demands and motivation, and the relevance of the
text to current tasks and goals. The model should accordingly address some
idealized degree of processing while being adaptable to a range of
approximations among individual readers. Here, the ideal representation is
clearly the network, not the tree, the list, the set, or the formula: A network
can retain connectivity even if some elements are left unlabeled or omitted
(e.g., skipped in reading or forgotten afterward) — precisely what readers
seem to do quite often (cf. Masson, 1979).
The initial processing unit is not the sentence
(after all, many texts are not composed exclusively of sentences, especially in
speech), but rather the stretch of text that can be comfortably held in working
memory under current limitations of attention, familiarity, and interest. It
would be important to learn whether that stretch is or is not usually a
sentence, but we must not assume so in advance. Other candidates deserve
consideration: a clause, a group of short sentences, a long noun phrase or verb
phrase, a chunk of propositions, a discourse action, and so on. The goal of
processing cannot be syntactic analysis, but rather building a model of a textual
world. By this I mean that a reader reconstitutes a more or less extensive
“world” containing the situations and events depicted by a text along with
various amounts of prior knowledge and assumptions that might reasonably apply
to such a world. Like any “model,” the text-world model of a particular
reader may have varying degrees of correspondence to whatever is postulated as
complete or ideal understanding of the text. Yet all readers’ models will
plausibly share some “family resemblance” with each other and with the text
producer’s own model. In this perspective, criteria for model design enter
both into the theory of reading and into the actual object of investigation.
Imagine someone reading the following passage from a
children’s story:12
(1) A great black and yellow rocket stood in a desert.
Empty, it weighed five tons. For fuel it carried eight tons of alcohol and
liquid oxygen.
(2) Everything was ready. Scientists and generals
withdrew to some distance and crouched behind earth mounds. Two red flares rose
as a signal to fire the rocket.
The ideal first step is to recognize the words of the surface
text and parse them into a structure of grammatical dependencies (a structure
with at least two grammatical elements, one of which cannot stand alone). That
task can be represented with an “augmented transition network” that moves
through a sequence trying to predict and identify the elements being
encountered. Presumably, the phrase is the usual unit, since it provides a
fairly well-defined pattern in English. The opening phrase “a great black and
yellow rocket” might be parsed as shown in Figure 1.

Figure 1. An augmented transition network for a noun-phrase macro-state
On encountering the determiner “a,” the processor enters
a macro-state of “noun phrase” and keeps predicting the “head”; when the
head is not found, the next highest search priority is for a
"modifier."13 This sequence is represented by dotted lines
for failed predictions and solid lines for the confirmed ones. The lines at the
top of the figure are for the grammatical dependencies such as
"modifier-to-head," giving us the labeled links for a dependency
network.
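The procedure just described can be sketched in a few lines of Python. This is a toy rendering under stated assumptions, not a transcription of Figure 1 or of any running parser: the lexicon, the `PRIORITIES` list, the treatment of “and” as a third prediction, and the function name `parse_noun_phrase` are all hypothetical. Failed predictions (the dotted lines) and confirmed ones (the solid lines) are kept in a trace, and the labeled dependency links are emitted once the head is found.

```python
# Toy lexicon: the word classes are assumed for this example only.
LEXICON = {
    "a": "determiner",
    "great": "modifier",
    "black": "modifier",
    "yellow": "modifier",
    "and": "conjunction",
    "rocket": "head",
}

# Search priorities inside the noun-phrase macro-state: head first.
PRIORITIES = ["head", "modifier", "conjunction"]


def parse_noun_phrase(words):
    """Return (trace, links) for a phrase that opens with a determiner.

    trace: (word, predicted class, confirmed?) for every prediction made.
    links: (dependent, label, head) triples for the dependency network.
    """
    if not words or LEXICON.get(words[0]) != "determiner":
        raise ValueError("the macro-state is entered on a determiner")
    trace = []
    pending = [(words[0], "determiner")]  # elements waiting for the head
    for word in words[1:]:
        actual = LEXICON.get(word)
        for predicted in PRIORITIES:
            confirmed = predicted == actual
            trace.append((word, predicted, confirmed))
            if confirmed:
                break
        else:
            raise ValueError(f"no prediction fits {word!r}")
        if actual == "head":
            # Simplification: conjunctions receive no link to the head.
            links = [(w, f"{cls}-to-head", word)
                     for w, cls in pending if cls != "conjunction"]
            return trace, links
        pending.append((word, actual))
    raise ValueError("phrase ended before a head was found")


trace, links = parse_noun_phrase("a great black and yellow rocket".split())
failed = [step for step in trace if not step[2]]
```

For the sample phrase, every modifier costs one failed prediction of the head, and the conjunction costs two failures before it is recognized. The sketch consults no meaning at all, which is exactly the simplification questioned in the discussion of Figure 2.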
A comparable network could be set up for the verb phrase "stood in a
desert." Eventually, the grammatical dependency network shown in Figure 2
could be obtained by discarding the left-to-right chaining and keeping only
relational links.

Figure 2. A grammatical dependency network
The arrows are pointers to the nodes designated by the abbreviated labels. This much
seems straightforward enough, but, in fact, a number of disturbing questions are
hidden away here. Does a reader build a network for a whole sentence without
consulting the meaning? Or is meaning consulted after each phrase (as is apparently called for in the Norman & Rumelhart
version)? Or does each dependency correspond to a list of candidate relations in
the conceptual domain? Could a reader bypass this kind of analysis and go
straight to the meaning? Or are all of these methods possible?
My own tentative conclusion is that all of these methods are possible, but,
according to the circumstances, not equally plausible or workable. Readers
probably have a set of preferences for correlating grammatical
relationships with conceptual ones (cf. Beaugrande, 1980a, p. 88), and the
organization of phrases or clauses could easily be consulted in modelling the
organization of situations or events. But interactions like these are asymmetrical,
i.e., not usually one-to-one and hence never fully reliable (cf. p. 295);
still, we customarily have constraints imposed by one domain upon
another.
To build a text-world model, the reader would set up
configurations of concepts connected by relations, e.g.,
"agent-of," "location-of," "attribute-of," etc.
(see more extensive listing in Beaugrande, 1980a, pp. 82ff.). Here, the reader
can use networks whose nodes are not surface words, but rather concepts for
which the words we see are only names, that is, instructions to activate
knowledge in memory. For the material underlying the opening sentence, the
reader would have the text-world fragment shown in Figure 3.14 The
ensuing processing would steadily expand this configuration by adding onto
already created nodes where feasible, in this case, mostly to the
"rocket" node. Thus, the material underlying the whole first paragraph
would yield a model space shown in Figure 4.

The dense linkage around the "rocket"
designates it as the topic concept: an operational control center for
storing, using, and recalling the text's content.
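A minimal sketch may make the notion of a topic concept concrete. It assumes that the text-world model is stored as a list of (concept, relation, concept) links and that the topic is simply the node with the densest linkage. The link inventory is one plausible reading of the first paragraph of the sample, not a transcription of Figures 3 and 4, and the relation labels other than "attribute-of" and "location-of" are invented for the example.

```python
from collections import Counter


def linkage(links):
    """Count how many links touch each concept node."""
    degree = Counter()
    for source, _relation, target in links:
        degree[source] += 1
        degree[target] += 1
    return degree


def topic_concept(links):
    """Return the most densely linked concept (assumed to be the topic)."""
    return linkage(links).most_common(1)[0][0]


# Fragment for the opening sentence (hypothetical reading).
first_sentence = [
    ("great", "attribute-of", "rocket"),
    ("black", "attribute-of", "rocket"),
    ("yellow", "attribute-of", "rocket"),
    ("desert", "location-of", "rocket"),
]

# Later material is attached to already created nodes where feasible.
rest_of_paragraph = [
    ("empty", "state-of", "rocket"),
    ("weight", "attribute-of", "rocket"),
    ("five tons", "quantity-of", "weight"),
    ("fuel", "part-of", "rocket"),
    ("eight tons", "quantity-of", "fuel"),
    ("alcohol", "substance-of", "fuel"),
    ("liquid oxygen", "substance-of", "fuel"),
]

model_space = first_sentence + rest_of_paragraph
print(topic_concept(model_space))
```

Counting links is of course a crude stand-in for whatever readers actually do, but it shows why a node that keeps attracting new material becomes the natural access point for storage and recall.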
Thus far, the text-world model might appear to contain
no concepts not explicitly activated by the text. However, such a restriction
could not be upheld very long. The second paragraph opens with material whose
relationship to the foregoing is not obvious: what is subsumed by
“everything,” and how do “scientists” and “generals” enter the
scene? Two solutions are possible: to set up fragments and wait for some later
account of relationships, or to use inferences for connecting the material as it
comes along. Let us assume that the latter solution is more plausible. Two
reasonable inferences can be entered (using square brackets in the nodes
involved) that “everything” includes whatever enables the rocket’s
“take-off,” and that the “scientists” and “generals” were there to
“observe” the rocket.15 If linkage is added to the effect that
the “ready” state was the reason for the “withdrawing,” “crouching,”
and “signaling,” the integrated configuration shown in Figure 5 is obtained.

Figure 5. Two merged model spaces with inference nodes
Proceeding in this fashion, a network for an entire
text can be modelled (see illustrations in Beaugrande 1980a, 1980b), but this
design should suffice to illustrate the advantages and disadvantages of the
approach. A main advantage is that an ideal pattern of concepts and relations is
obtained against which the patterns of readers’ reports of the content can be
matched. For example, one reader began a protocol like this:
In a desert, a rocket waited to be launched: it weighed five
tons empty. The generals and technicians stepped back behind dirt mounds
and launched two red flares signalling the launch of the rocket.
An account of this report in terms of surface texts or single
propositions would be quite intricate, and not very general. But a matchable
pattern with a few simple conventions can be designed: Put recovered nodes in
the positions corresponding to the original model, mark alternative concept
names with “*”, set inferred nodes in square brackets, and the outcome is
represented in Figure 6.

Fig. 6. Section of a reader’s recall protocol
Continuing in this fashion for a set of reader reports, I can
get some graphic impression of the typical ways that readers make sense out of
whatever they understand and remember when connecting it into a coherent
configuration. For instance, it appeared that readers tended to lose whole
spaces, rather than isolated nodes and links (in this sample, the material
regarding the fuels and “everything” being “ready”); that readers tend
to fill in links among the concepts they do retain (in this sample, the
“launching” of “flares” was assigned to “generals and
technicians”); and that, by and large, readers are more concerned with
maintaining coherence than with reproducing faithfully what has been read.
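The marking conventions for protocols (recovered nodes, “*” for alternative names, square brackets for inferred nodes) lend themselves to a simple matching routine. The following is a sketch only: the node lists and the mapping from recalled names to model concepts are assumptions made for illustration, not data from the study.

```python
# Hypothetical nodes of the ideal text-world model.
model_nodes = {"rocket", "desert", "scientists", "generals", "flares", "fuel", "ready"}

# One reader's recalled concepts, each mapped to the model concept it stands
# for; None means the concept has no counterpart in the model (an inference).
recalled = {
    "rocket": "rocket",
    "desert": "desert",
    "generals": "generals",
    "technicians": "scientists",  # alternative name for a model concept
    "flares": "flares",
    "launch": None,               # supplied by the reader
}

def annotate(model_nodes, recalled):
    """Mark each recalled concept and list the model nodes that were lost."""
    marked = []
    for name, target in recalled.items():
        if target is None:
            marked.append(f"[{name}]")   # inferred node
        elif name != target:
            marked.append(f"{name}*")    # alternative concept name
        else:
            marked.append(name)          # recovered as given
    recovered = {target for target in recalled.values() if target is not None}
    lost = sorted(model_nodes - recovered)
    return marked, lost

marked, lost = annotate(model_nodes, recalled)
print(marked)  # ['rocket', 'desert', 'generals', 'technicians*', 'flares', '[launch]']
print(lost)    # ['fuel', 'ready']
```

Deciding which recalled name corresponds to which model concept remains a judgment of the analyst; the routine only records that judgment consistently.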
Another technique I have used is to design for the
text-world model a matching “world-knowledge correlate” containing facts
readers would be likely to know before encountering the text: that rockets
take off and burn fuel, that burning causes heat and thus makes people seek
shelter, that generals use rockets to attack, while scientists use them to
explore outer space, and so forth. A correlate for the sample might look like
Figure 7.

Fig. 7. A section of a world-knowledge correlate
Again, nodes not present in the text are identified with
square brackets. And a distinction is made between determinate relations (marked
with a Greek delta) for necessary components, and merely typical relations
(marked with a Greek tau) for usual, but not indispensable components of
commonsense facts.16
Knowledge of these particular facts is no doubt
influential, but it does not guide processing with any major scale or power.
Included is a schema for “flight” as a sequence of events: “take off,”
“ascend,” “reach a peak,” “descend,” and “land” are the
indispensable components shown in Figure 8.

Figure 8. A schema for flight
Suppose that upon encountering a “rocket,” readers
activate the “flight” schema and predict that the schema events will be
mentioned. Notice, however, that the opening text fragment does not in fact
mention any of these events, but rather describes the scene and some incidental
preparations. When subjects reported what they read, they very often moved some
mention of the “take-off” event right up to the beginning of their protocols:
Such was done in the first sentence of 13 protocols within one group of 42
readers. This kind of processing is anticipated by inserting that inference node
for “take-off” in Figure 5. It is hypothesized that the events of the schema
are regularly inserted into a text-world model as control centers for
integrating and organizing what the text actually says. (This graphic is given
in Beaugrande, 1980a, p.
The disadvantages of the representation are not minor.
Numerous tests showed that readers tend to visualize whole scenes: The setting
was specified as “sandy plains” “under a bright sun,” where a rocket was
on its “launching pad.” If slides of rocket launchings were shown while the
text was read aloud, these scenes were influenced; yet the presented
network fails to capture this “visuality” of discourse, which was not
integrated in terms of the mainframe of text study until my New Introduction
(Beaugrande 2004).
Another drawback is that the words of the text are
retained as concept names, so that the underlying meanings (e.g., of “black”
or “fuel”) are not specified. If primitives were used (e.g., those proposed
by Schank and Lehnert), the configuration would look quite different, especially
the links. (For instance, the connection between “fuel” and “fire” or
“fire” and “flares” would become obvious.) Surely some portion of
coherence is attributable to relationships among the basic components of
knowledge (cf. Gentner, 1978).
A third disadvantage is that the model remains static,
even though the text world changes every time an event takes place. For
instance, when the take-off happens (in a later part of this text), most of the
facts stated in these opening paragraphs are no longer true. Some means should
be developed to capture the updating of a textual world as reading progresses
(e.g., that would convert the location of “rocket” from “desert” to
“sky”). It might emerge that the changes readers make when building their
own models of text content reflect the effects of updating in regular ways.
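What such updating might involve can be indicated with a small sketch. Everything here is hypothetical: the state table, the “state-of” label, the event effects, and the function name `update` are assumptions introduced for illustration, and the model as presented contains no such mechanism.

```python
# Current state of the text world: (concept, relation) -> value.
world = {
    ("rocket", "location-of"): "desert",
    ("rocket", "state-of"): "ready",
}

# Hypothetical effects of events: the facts each event overwrites.
event_effects = {
    "take-off": {
        ("rocket", "location-of"): "sky",
        ("rocket", "state-of"): "flying",
    },
}

def update(world, event):
    """Return a new world state with the event's effects applied."""
    updated = dict(world)                       # keep the earlier state intact
    updated.update(event_effects.get(event, {}))
    return updated

after = update(world, "take-off")
print(after[("rocket", "location-of")])  # sky
print(world[("rocket", "location-of")])  # desert
```

Returning a new state rather than overwriting the old one keeps the earlier configuration available, which seems desirable if readers can still report what was true before the event.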
One final drawback deserves notice: that the
representation does not account for “reference” as a correspondence of
concepts to some real or possible objects and events in the world of everyday
experience. If the text were a report of an actual flight, and readers had been
present there, they might be judging the various assertions as true vs. false,
or accurate vs. inaccurate. One conjecture is that reference is done for whole
model spaces, not for individual words and concepts, because context is required
for efficient matching. But since humans often do not have such access to the
reality, they are skilled in building models of that domain also. Hence,
reference would be a matching of
models, and truth (or accuracy) a threshold of correspondence that people in
communication are disposed to demand.
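The conjecture that reference is a matching of models, with truth as a threshold of correspondence, can be made concrete in a rough way. The overlap measure used below (shared links divided by all links, i.e. Jaccard overlap) is my own assumption for the sake of the sketch, as are the two small models and the function names.

```python
def correspondence(text_model, reality_model):
    """Share of links common to both models (Jaccard overlap, an assumed measure)."""
    union = text_model | reality_model
    if not union:
        return 1.0  # two empty models correspond trivially
    return len(text_model & reality_model) / len(union)

def accepted_as_true(text_model, reality_model, threshold):
    """Accept the text model if its correspondence reaches the demanded threshold."""
    return correspondence(text_model, reality_model) >= threshold

# Hypothetical model spaces as sets of (concept, relation, concept) links.
text_model = {
    ("rocket", "location-of", "desert"),
    ("rocket", "attribute-of", "black"),
    ("generals", "agent-of", "observe"),
}
reality_model = {
    ("rocket", "location-of", "desert"),
    ("rocket", "attribute-of", "black"),
    ("scientists", "agent-of", "observe"),
}

print(correspondence(text_model, reality_model))         # 0.5
print(accepted_as_true(text_model, reality_model, 0.4))  # True
print(accepted_as_true(text_model, reality_model, 0.8))  # False
```

The same pair of models is thus “true” for a lenient audience and “false” for a demanding one, which is the point of treating truth as a threshold rather than a fixed property.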
One group of readers was tested some six weeks after
they recalled the “rocket” text. They were asked to decide the “truth”
of statements like “A rocket always burns fuel” and “A rocket is often
used to attack military targets.” The outcome was disturbing. Only two
statements were received with unanimous agreement (“Without fuel, a rocket
can’t fly” and “Launching sites usually are located on the earth”). In
all other cases, people couldn’t agree whether the facts were always, often,
or sometimes true. Average US readers are certainly uninformed about rocketry
— they even have the “rockets’ red flare” in their National Anthem. Yet
people do much of their reading about areas in which they are not experts, and
reading models must respect that factor. Traditional logicians’ models of
reference are unrealistic. Our concern should rather be with “general design
principles, which appear to be important in the effective utilization and
integration of large amounts of intrinsically uncertain knowledge” (Allport,
1979, p. 61).
I have deliberately attacked my own model to show the
pressing need for further evolution. In that regard, one major contribution of
any theory or model (beyond its
potential for representation and discovery) is its ability to reveal the mode of
its own transcendence. Endel Tulving (1979, pp. 29ff.) observes:
There is every reason to believe that all current ideas,
interpretations, explanations, and theories are wrong, in the sense that
sooner or later they will be modified or rejected . . . . Supporting, retaining,
and affirming existing theories longer than necessary more often than not stands
in the way of progress. Experiments aimed at existing theories should be
designed to find out how, where, and in what sense these theories need revision
or why they should be rejected.
3. Toward an Ideal Model?
To adequately explore reading, a necessary first step
is a firm definition of the notion of “text”. It cannot be just a series
of sentences, as one is often required to assume. A text is often both more and
less than the sum of its parts. And people communicate quite successfully with
texts even though these parts are fuzzy and non-determinate in isolation, e.g.,
the meanings of individual words and concepts being subject to dispute. The
definition posited here is: the text is an actual system, that is, a
working system in which decisions and selections have been made such that the
various occurrences have some function(s) in contributing to the operations of
the whole. This system is quite different from a virtual system like
syntax, which merely states the available options. In an actual system,
occurrences mutually constrain each other and thus constrain what text users can
do. To disregard these constraints — for instance, by assigning one’s own
personal meanings to words, like Humpty Dumpty — is to abuse the system and
endanger its operation. Some latitude is of course allowed, according to a
principle that might be compared to cybernetic regulation: A change in a system
elicits some compensatory actions elsewhere. (For instance, speakers can invent
their own meanings if they stop and explain them.)
Each subsystem of a text (e.g., lexicon,
grammar/syntax, concepts, plan steps, discourse actions, and so forth) runs
partly on internal principles and partly on requirements of feedback from the
others. This interaction of subsystems is regular, but asymmetrical, lacking
fixed one-to-one correlations of operations, functions, or elements (cf. p.
289). When many elements in one subsystem would each suffice alone to signal a
higher-power single function in another subsystem, redundancy results.
The stability of a text as a system is constituted by
a functional continuity of occurrences within or among its interacting
subsystems: The function of each occurrence (its contribution to the system’s
operation) must somehow be relevant to that of others, for instance, by
providing access, imposing constraints, signalling contingent decisions, and so
on. The significance of occurrences or of configurations of these, such as word
meanings, sentence formats, speech-act purposes, and so on, can be viewed as an
ordered set of hypotheses about appropriate processing actions. When these
actions have been carried out to the satisfaction of the processor, a threshold
of termination is reached, and operations are directed elsewhere.
It is difficult to believe that the staid appeal to
the “author’s intentions” is really decisive. (Compare the use of the
notion in Frederiksen, 1977; Clark, 1978; Freedle & Meyer, 1979.) The author
must not only “intend” to elicit processing actions; he or she must also
know how to do so successfully. The intentionality must somehow be actualised in
the text as a configuration that imposes a characteristic demand for reading
tasks to be done with certain results. The author can influence the probability
of those results, but cannot force them to come about.
The same point can be made at the readers’ end. Many
researchers fear that by downgrading the author’s intention, we open the door
to utter anarchy, as in Herbert Clark’s (1978, p. 295) whimsical case of
someone being “adventitiously reminded of her mad Uncle Harry.” But, as I
already noted, abusing a system renders it dysfunctional, and communication
could break down. And how are readers to know the author’s intention except by
enacting the text?
I have tried to specify the nature of the text-system
according to seven principles of textuality. Cohesion is the principle
that connectivity should obtain among the surface elements of the text (sounds,
letters, syntax). Coherence is the principle that connectivity should
obtain among the underlying concepts and relations. Intentionality is the
text producer’s attitude that a cohesive, coherent text is being created for
some goal; acceptability is the corresponding attitude on the text
receiver’s part. Informativity results from the extent to which text
occurrences are neither probable nor predictable in their context. Situationality
is the relevance of a text to a current or recoverable situation. Intertextuality
is the principle whereby the utilization of a given text depends on knowledge
and experience gained from using other texts (specifically or in general).
For an extensive discussion of these principles, I can
only refer to my previous work (Beaugrande, 1980a, 1984, 1997, 2004; Beaugrande
& Dressler, 1980). It must suffice to note here that each of these
principles is chiefly devoted to continuity, which, in most cases, is not openly
enacted in the surface text. Accordingly, I have appealed to the notion of problem-solving
as a general search and discovery method for connecting systemic states (Newell
& Simon, 1972). A problem is defined here simply as a state from
which the attainment of a desired successor state is not certain or obligatory.
Problems may be trivial if their solution is easy to find, or serious
if the chances for failure (non-attainment of the envisioned state) outweigh
those of success.
It must be stressed that a text with no problems at
all, hence, a system with total stability, is neither possible nor desirable. No
text can make every connection explicit. And every text is in some ways unique.
But even a text with only trivial problems and occurrences of high
predictability would be of little value, and readers would lose interest very
soon. At the other extreme, a text with numerous serious problems and almost no
predictability would overload the processing capacities of normal readers. Thus,
the topmost control factor for creating a text must be to keep a suitable degree
of unpredictability and problematicity for the expected audience. It is in this
sense that the author sets tasks for the readers and speculates on the outcome.
Communication is presumably driven by the constant removal and restoration of
satisfactory (but not total) stability. There may even be some neurological
impulse to escape the stagnation of cognitive systems; boredom is naturally
unpleasant.
It should now be clear that no model of reading can
dispense with assuming heavy processor contributions, even to deal with the
surface text. Phrases, clauses, sentences, coherence, purpose — all these must
be enacted, not just found. Moreover, there must be operations for
contributing additional material. Spreading activation ensues without special
command when some element is placed in working memory, that is, its active
status “spreads” to other items closely associated in stored knowledge
(Collins & Loftus, 1975). Inferencing is done whenever there is a specific
discontinuity to be bridged. (See Warren, Nicholas, & Trabasso, 1979.) The
distinction that might be drawn (and not all researchers do so) is one of
directedness and conscious control. Though usually cited in dealing with
conceptual coherence, these two operation types might be envisioned as running
in any textual subsystem, and with any degree of scale or power. It would follow
that part of the problem-solving posited in my model is performed without effort
through spreading activation (cf. Ortony, 1978).
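A toy version of spreading activation may help fix the idea. It is a sketch under my own assumptions, not the mechanism specified by Collins and Loftus: the association strengths, the multiplicative weakening, the cut-off `floor`, and the function name `spread` are all hypothetical.

```python
# Hypothetical association strengths in stored knowledge (between 0 and 1).
associations = {
    "rocket": {"fuel": 0.8, "flight": 0.9},
    "fuel": {"fire": 0.5},
    "flight": {"take-off": 0.9},
}

def spread(start, associations, floor=0.5):
    """Activate `start` fully, then pass activation along links, weakening it
    by each link's strength and stopping where it would fall below `floor`."""
    activation = {start: 1.0}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for neighbour, strength in associations.get(node, {}).items():
            level = activation[node] * strength
            # Activate only if strong enough and stronger than any earlier route.
            if level >= floor and level > activation.get(neighbour, 0.0):
                activation[neighbour] = level
                frontier.append(neighbour)
    return activation

active = spread("rocket", associations)
print(sorted(active))  # ['flight', 'fuel', 'rocket', 'take-off']
```

In this run “take-off” becomes active without being mentioned, while “fire” stays below the cut-off; nothing in the routine is directed at a particular gap, which is what separates it from inferencing in the sense used above.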
We must now confront the issue of automatization.
Following LaBerge and Samuels (1974, p. 295), we can define “automatic”
processes as those done “while attention is directed elsewhere.” If
“attention” is in turn defined as an expenditure of processing resources
that inhibits the potential for other tasks at the same time (Keele, 1973), we
can conclude that automatic processes do not compete with whatever other
operations the mind must perform simultaneously. Clearly, at least some areas of
understanding must run automatically, since the brain has a limited capacity for
complex tasks (cf. Posner & Snyder, 1975, pp. 64ff.). But it is probably too
strong to claim that it is all automatic (as Anderson & Bower, 1973, suggest
for memory). As a working hypothesis, it might be argued that conscious control
is retained over high-powered processing, while low-powered tasks are done by
automatized subroutines. Between the two domains there would be a flexible
threshold that moves up or down depending on how much time and attention can be
made available. A possible corollary is that most laboratory experiments tend to
manipulate this threshold and hence to yield unnatural results (cf. p. 302).