Original version in Text 1(2), 1981, 113-161 Revised version July 2005

 

Linguistic Theory and Meta-Theory

for a Science of Texts

 

ROBERT DE BEAUGRANDE

 

Abstract

This article explores the typical reactions which occur when an established science confronts a new object of inquiry, as when linguistic theory encountered the text. The usual discussions are not productive as long as the old ‘paradigm’ is still accepted as the framework for achievement. The issues are therefore re-examined in terms of the meta-theory of science (e.g. Sneed, Stegmüller, Lakatos, Feyerabend, Hempel), and some general solutions are expounded for the problems of validating theories on the basis of empirical content. A paradigmatic example is then presented in order to show a possible role for logical linguistics in future theories: a computer grammar that parses text-sentences into a progressive network and back again via theorem-proving, with further capacities for applying schemas, answering questions, or generating summaries. This example may serve as an application of general design values and criteria for preferring and comparing alternative theories.

 

1. Historical background

1.1. When a new object of inquiry1 presents itself to an established scientific discipline, there is not likely to be any immediate consensus either about the nature of that object or about the most productive theories and methods for treating it. The scientists themselves typically fall into three groups of response. The first group denies admission to the new object of inquiry on the grounds that it is no proper concern of science, either because it lacks any systematic nature or because it falls under already established concepts of superior status. The second group is willing to admit the new object of inquiry under the proviso that current theories and methods can be extended or modified to encompass it. The third group undertakes the far more arduous task of designing new theories and methods to suit the new object of inquiry, such that its nature can be captured in the most direct and insightful fashion. Since each of the three groups is making assumptions about ‘objective truth’, each one tends to view its own position as the only reasonable and correct one. Consequently, the groups often ‘talk right past each other’, to borrow Stegmüller’s (1976: 159) phrase.

1.2. The three-way confrontation just described has arisen with regard to admitting the text as an object of linguistic inquiry:

(a) Some scholars denied that texts are proper objects, since their nature equals that of super-long sentences accessible via adequate or complete sentence grammars (e.g. Katz and Fodor, 1963; Dascal and Margalit, 1974).

(b) Some scholars hoped that the prevailing transformational and logical theories could be altered to shift from sentences to texts (e.g. Petöfi, 1971; van Dijk, 1972; Ballmer, 1975).

(c) Some scholars elected to set aside prevailing sentence theories in search of new theories more directly amenable to the special considerations of text and discourse2 (e.g. Winograd, 1972; Norman and Rumelhart, 1975; Schank et al., 1975; Coulthard, 1977; Kintsch and van Dijk, 1978; Petöfi, 1979; Beaugrande 1980a, b; Beaugrande and Dressler, 1981).

1.3. Understandably, some linguists simply ignored the text, though non-response might legitimately be classed as a special kind of response. Some of this group might be genuinely unaware of the new trend; but such blissful ignorance is becoming steadily harder to maintain. The encroachment of text and discourse into linguistic research must be viewed in correlation with an enduring scientific crisis in the ‘standard theory’, at least as postulated by Chomsky (1965). Such a crisis ensues when the prevailing theory is no longer considered an adequate tool for solving research problems. There ensues a phase of extraordinary research, described by Stegmüller (1976: 144):

More and more new formulations and versions are probed. There is a willingness “to try anything.” Open expressions of discontent are heard. Foundational discussions commence, and refuge is sought in philosophy.

All of these symptoms have been familiar for over a decade: new formulations such as ‘generative semantics’ (e.g. Lakoff, 1971); ‘case grammar’ (e.g. Fillmore, 1968; 1977); ‘stratificational grammar’ (e.g. Sampson, 1970); ‘relational grammar’ (e.g. Cole and Sadock, 1977); ‘arc-pair grammar’ (Johnson and Postal, 1980); expressions of discontent (e.g. Oller, 1970; Robinson, 1975); and foundational discussions with appeals to philosophy of science (e.g. Dresher and Hornstein, 1976 vs. Winograd, 1977, and Schank and Wilensky, 1977; Fodor, 1978 vs. Johnson-Laird, 1978).

1.4. In such a context, the individual scientist faces a difficult decision about the best kind of response. On the one hand, any further effort on behalf of the standard theory may well turn out to be futile and prematurely obsolete. On the other hand, the standard theory still dominates a sufficient proportion of the scientific profession that scientists may at least temporarily risk their professional livelihood by turning to another theory.3 The continuing failure of the standard theory to handle what Kuhn (1970) would call anomalies (experiences that fail to meet the scientist’s normal expectations) is only part of the picture: no theory will ever be rejected until there is an appealing successor to replace it. Robert Stockwell (1977: 196) concludes his pot-boiler presentation of ‘transformational syntax’ with the ominous comment that

scholars will cling tenaciously to an explanation, a principle, or a “law,” that they know to be wrong, because they do not have in hand an alternative explanation which is clearly better in two crucial ways [...]: (1) it must cover the same range of facts, or an enlarged range of facts; and (2) it must do so more simply, more satisfyingly in some sense that is ultimately aesthetic, than the hypotheses currently in use.

Scientists must after all have some theory in order to function at all.

1.5. But where is the new theory to come from? The most naive answer is: from an impartial observation of the facts about language. Unfortunately, it is now widely conceded that impartiality is always impeded by the necessity of consulting some (implicit or declared) theory in order to decide what the facts are. Hanson (1958) has introduced the notion of the ‘theory-ladenness of all observational data’. A disturbing consequence of this notion has kindled a fierce controversy in philosophy of science: that an established theory may be immune to falsification. A scientist who rejects a theory outright because of its failures, and does not at the same time substitute a new theory, will merely be discredited as a scientist; ‘inevitably he will be seen by some colleagues as “the carpenter who blames his tools”’ (Kuhn, 1970: 79). Moreover, failures are often blocked in advance because a theory is free to determine its own applications, which is the rule of autodeterminism (Stegmüller, 1976: 13), and hence to exclude immediately the domains where it is likely to fail. By a small (though not necessary) step, we arrive at autoverification (Stegmüller, 1976: 159): the theory verifies itself by obliging us to find the ‘facts’ it predicts.

1.6. The impact of autodeterminism on research is revealed in the now familiar explosion of mini-theories in linguistics and language psychology, each mini-theory accounting for only one tiny domain of language. We encounter short papers bearing such titles as ‘two theories about adjectives’ (Kamp, 1975); or, more often, the titles promise us ‘notes’, ‘some remarks’, ‘a few comments’, and the like. Individual scientists protect themselves from being discredited: they only address a few ‘facts’ which a mini-theory can hopefully survey. The outcome is a wealth of findings that lack what Neisser (1976: 7ff.) terms ecological validity: relevance to ‘culture’ and to the ‘main features’ of an object of inquiry ‘as they occur in ordinary life’.

1.7. The standard theory of ‘generative transformational grammar’ specifies for itself quite a large domain of application: to ‘explain our ability to produce and understand a virtually infinite number of sentences of unlimited length’ (Dascal and Margalit, 1974: 105). This goal emerges from defining a natural language as an infinite set of sentences whose well-formedness is decidable with reference to a grammar. As headman Chomsky (1957) himself pointed out, these definitions entail very strong assumptions; but they enable us to treat some important aspects of language by recourse to an already available type of theory: axiomatic logic. A set of formation rules produces a set of axioms as basic formulas; and theorems are created by the finitely frequent application of derivation rules to the axioms (Stegmüller, 1969: 35). Chomsky introduced linguistic equivalents for each notion: ‘phrase-structure rules’ (formation rules); ‘kernel sentences’, later changed to ‘deep structures’ (axioms); ‘grammatical’ or ‘well-formed sentences’ (theorems); and ‘transformational rules’ (derivation rules).
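
The correspondence between axiomatic logic and its linguistic equivalents can be sketched in a few lines of Python. The toy lexicon, the rule table, and the passive ‘transformation’ below are illustrative assumptions, not rules taken from Chomsky (1957); the sketch only shows how formation rules yield axioms (kernel sentences) and how a derivation rule yields further theorems (well-formed sentences). Unlike a natural language as defined above, this toy language is finite.

```python
from itertools import product

# Hypothetical toy phrase-structure rules (the "formation rules").
PHRASE_STRUCTURE = {
    "S": [["NP", "V", "NP"]],
    "NP": [["the dog"], ["the cat"]],
    "V": [["chased"], ["saw"]],
}

# Hypothetical participle table used by the toy passive rule.
PARTICIPLE = {"chased": "chased", "saw": "seen"}


def expand(symbol):
    """Return every terminal sequence derivable from `symbol`."""
    if symbol not in PHRASE_STRUCTURE:
        return [(symbol,)]  # a terminal expands to itself
    results = []
    for rhs in PHRASE_STRUCTURE[symbol]:
        parts = [expand(s) for s in rhs]
        for combo in product(*parts):
            results.append(tuple(w for part in combo for w in part))
    return results


def kernel_sentences():
    """Axioms: the sentences produced by the formation rules alone."""
    return expand("S")


def passive(kernel):
    """Derivation rule: map a kernel (subject, verb, object) to a passive."""
    subj, verb, obj = kernel
    return (obj, "was", PARTICIPLE[verb], "by", subj)


def theorems():
    """Axioms plus the results of one application of the derivation rule."""
    axioms = kernel_sentences()
    return set(axioms) | {passive(k) for k in axioms}


def is_well_formed(sentence):
    """Well-formedness is decided by membership in the derivable set."""
    return tuple(sentence) in theorems()
```

In this sketch, `is_well_formed(("the cat", "was", "seen", "by", "the dog"))` holds because the sequence is derivable from a kernel sentence, whereas `("chased", "the dog", "the cat")` is rejected because no rule produces it.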

1.8. The advantages of having a well-defined approach already available were immediately obvious. But it was less obvious that the theory itself was entering into a remarkable relationship with its domain of application. Carnap (1966) was proposing to use axiomatic logic as the metatheory for all scientific theories: scientists could set up their terms in precisely the same way that logicians construct a formal language (classifying symbols). If this project is accepted, then the object of linguistic inquiry (natural language) becomes simultaneously a theory; and conversely, any theory is formally equivalent to a transformational grammar. The methods for ‘generating’ a grammatical sentence (i.e. by ‘assigning it a structural description from the grammar’) are exactly the same as those for proving a scientific statement in the kind of science envisioned by Carnap; well-formedness in the grammar is structurally identical with truth in such a science. It is a small wonder then if transformational grammar seemed to meet standards of scientific authority far better than any other linguistic theory and discouraged the individual scientist from challenging it.

1.9. Transformational grammar also illustrates the extremely intricate and precarious role of falsification in scientific enterprises. Empirically discoverable data were sentences of a language. Their treatment involved stating the rules which would reduce them to (and ‘generate’ them back again from) the axiomatic ‘kernel sentences’ or ‘deep structures’. Suppose now that someone else presented a new group of sentences that cannot be treated in the way these rules provided. This occurrence was never taken as a falsification of the theory, but only as a call for more rules, i.e., as a motive for quantitative rather than qualitative changes. Demands for the admission of qualitatively new factors were usually deflected by assigning those factors to ‘performance’, a domain officially excluded from the theory (cf. illustration in 1.11).

1.10. The theory thus remained safe from any serious refutation until some researchers attempted to offer it also as a theory of human processes: people produce and understand real sentences along the same lines as does the standard transformational grammar. The task of proving this claim was entrusted to the burgeoning field of ‘psycholinguistics’. Notice that, due to the design of the theory, the object of enquiry is the theory: we have here the most extreme possible case of the ‘theory-ladenness of all observational data’ (cf. 1.5). Moreover, there was initially no alternative theory under consideration, so that refutation was (as Kuhn would conclude) utterly inconceivable. But commonsense intuitions slowly assumed the role of counter-arguments. In particular, it seemed intuitively implausible that the basic mental entity underlying actual sentences could be a purely syntactic simplification. It was far more plausible that such an entity would be a representation of the meaning. This intuition gradually emerged in the alternative theories of ‘case grammar’ and ‘generative semantics’, although both adopted a large portion of the design of the standard theory. Further pressure arose when research failed to confirm a very basic tenet of the standard theory: that people can uniformly and reliably distinguish between grammatical and ungrammatical sentences of their language (see, for example, Carden, 1970; Heringer, 1970; Ringen, 1975; Greenbaum, 1977). What speakers are apparently able to do is to imagine possible contexts, i.e. meaningful, purposeful situations in which a given sentence might or might not be used (Bolinger, 1968; McCawley, 1976; Snow and Meijer, 1977). And this broad range of knowledge is nowhere accounted for in the standard theory. Judgments of grammaticalness also vary in specific ways: (a) the order in which sentences are presented to test subjects affects judgments (Greenbaum, 1973); (b) the sentences appearing together in a text influence our perception of the grammaticality of any one of them (van Dijk, 1977b); (c) sentences which elicit mental images are more likely to be judged correct (Levelt et al., 1977); and so forth.4

1.11. All of these findings and intuitions constitute anomalies for the standard theory (cf. 1.4). They place it under pressure to respond, though the theory cannot actually be dislodged until a viable alternative is actually presented and known. Depending on which of the three groups (elaborated in 1.1) a scientist belongs to, three classes of response to the anomalies are likely to be made. Those wishing to preserve the status quo at all costs will narrow down their claims about the empirical content of a theory. This response is well typified by Dresher and Hornstein’s (1976: 328) assertion that ‘a study of competence abstracts away from the whole question of performance, which deals with problems of how language is processed in real time, why speakers say what they say, how language is used in various social groups, how it is used in communication, etc.’ This response effectively dissolves the standard theory’s responsibility to have any empirical content whatsoever, because no language samples are empirically found except in some kind of communication in social groups. Itkonen (1976: 56) follows along in denying that linguistics can be an empirical science, because grammars of ‘correct’ sentences are constructed with no consideration of time and place. It is intriguing that Dresher and Hornstein dismiss all research other than the standard theory as ‘unscientific’: they not only mean that such research falls outside their own scientific paradigm (as shown by Winograd, 1977) but they are probably appealing also (at least unconsciously) to the formal equation of transformational grammar with scientific theory itself, based on axiomatic logic (cf. 1.8).

1.12. This hard-line response is, I surmise, no longer the consensus among linguists (Lakoff, 1978). For one thing, it leaves the linguist very little to do except to design rules and compare them to his or her own personal intuitions about grammaticality. The most interesting claims (about human language knowledge) are dropped, and, in consequence, ecological validity is eliminated (cf. 1.6). Thus, even if researchers lack an alternative theory, they are assailed by grave doubts about why they should bother at all with the standard one. It is not surprising therefore that many linguists adopt the second kind of response: to alter the standard theory in such a way as to make empirical content accessible. One such response was van Dijk’s (1972) early proposal for ‘text grammar’ to replace ‘sentence grammar’. I shall briefly inspect this case because it elicited Dascal and Margalit’s (1974) airy counter-response in favour of upholding the status quo (thus placing them in the first group of scholars depicted in 1.1). We are told that ‘the properly linguistic facts of texts’ can be ‘taken care of’ by a ‘complete sentence grammar’ which (not now available, you understand, but yet to arise in some indefinite future time) will ‘solve’ ‘all the problems related to producing sequences of syntactically independent sentences’ ‘without needing any special additions’ (Dascal and Margalit, 1974: 82, 86). Two ways were envisioned in which the text will be absorbed by sentence grammar: (a) as one superlong sentence (the idea of Katz and Fodor, 1963); and (b) as analogous to the ‘deep structures’ of the standard theory,5 which, being given as unordered groups of interrelated simple sentences, are ‘like van Dijk’s texts’ except that for van Dijk, ‘order is a factor of major importance’ (1974: 92ff.).

1.13. This method of refutation leads into another issue of major import for scientific meta-theory: terminology and concepts. An established theory in what Kuhn (1970) called ‘normal science’ has the right to specify the conventional terminology, and it can insulate itself from falsification by means of this prerogative. Generative semantics, for instance, was first downplayed by hard-liners of the ‘standard theory’ as being a mere ‘notational variant’ of the latter, i.e., another way to represent the same thing (e.g. Katz, 1971). Dascal and Margalit (1974: 82) follow just this tactic in arguing against van Dijk (whose affinities to generative semantics were readily evident) when they argue that ‘the new problem reduces to a mere terminological issue’ (a case they decry as ‘degenerating’).6 Robert Stockwell (1977: 197) who, as we noted in 1.4, already seemed to foresee the downfall of the transformational approach, tried to save the situation by ‘hoping’ that ‘the differences’ between ‘all theories’ ‘will increasingly turn out to be matters of notation and terminology’. My own conjecture, to be elaborated in sections 2 and 3, is that these ‘differences’ will more likely be matters of empirical content (cf. 2.24ff.).

1.14. Dascal and Margalit’s counterattack is instructive in its format. They break down van Dijk’s (1972) arguments for a text grammar into three categories: (a) methodological, (b) grammatical, and (c) ‘psycholinguistic’. The methodological arguments are disposed of straightaway on the grounds that van Dijk’s (1972: 22) notion of ‘naturalness’ will be offset by a detrimental increase in ‘complexity’.7 The grammatical arguments are met with the contention that the factors adduced by van Dijk (such as definite/indefinite articles, pronominalisations, and presuppositions) can be treated by the suitable ‘complete’ sentence grammar promised us for the indefinite future (cf. 1.12). Van Dijk (1972: 132) seems to admit as much when he says: ‘the rules, elements, and constraints that permit a correct combination of subsequent sentences’ ‘could be introduced into a sentence grammar without profound changes in the form of the grammar’. Thus, Dascal and Margalit’s refutation of the first two categories of argument appears successful, provided one accepts the standard theory’s stipulation of linguistic science. I shall argue later that this stipulation is invalid in a way not yet foreseen in van Dijk’s dissertation, dating after all from an early and long since abandoned stage in his agile career (cf. 1.16; 2.13; 3.1ff.).

1.15. Van Dijk’s ‘psycholinguistic’ arguments for text grammar hinged upon the ‘semantic coherence which surpasses the micro-structural or local constraints on its individual sentences’ (1972: 132). He envisioned a ‘global plan’ ‘underlying actual speech (utterance production)’. But he wants to retain the basic logical foundations of linguistic theory: this plan is to be ‘a grammar’ to ‘specify a level of sufficient abstractness, generatable by a finite set of rules which determine the production and reception of extremely complicated sequences of a text’; ‘this level can appropriately be considered a semantic deep structure of the text’ (1972: 132ff.). This logical bias naturally retains the prominence of sentences as linear units: ‘text surface structures can become gradually and linearly more complex, while their deep structure will roughly keep the form of a sentential proposition’ (1972: 140).8 Along with generative semanticists, van Dijk proposes to treat meaning in terms of axiomatic logic: ‘any “minimal” meaning structure, be it micro-structural or macro-structural, has the (modalized, quantified) “proposition”-form’ (1972: 140f.). It is therefore understandable that Dascal and Margalit view ‘deep structures’ (or, as van Dijk soon came to call them, ‘macro-structures’) as ‘huge logical formulae’, and their formation rules as ‘those of a modelled version of the predicate calculus’ ‘with extra categories, such as “actant”, “text qualifier”, etc.’ (1974: 104).

1.16. It should now be clear that van Dijk’s text grammar and the prevailing ‘standard theory’ share the extremely strong assumption that axiomatic logic embodies human communication, rather than merely being one possible tool for representing it. Thus, both want to make the object domain structurally identical with the theory itself (cf. 1.8). It follows that empirical research is not so much a test of anybody’s theory as a test of the representational powers of axiomatic logic. The same difficulties which the standard theory encountered regarding ‘grammaticalness’, ‘well-formedness’, ‘correct derivation’, etc. will overtake any linguistic theory making the same strong assumption, whether the specific entities involved are texts or sentences.

1.17. Perhaps the whole notion of text grammar is disproportionate because it takes grammatical categories as central to all others (cf. 3.4). Consider two predicaments that van Dijk’s adherence to logic-based grammar engenders. He ‘suggests’ that ‘the formal criteria for text coherence are similar to those of deductive and inductive derivations, where preceding sentences (presuppositions) can be considered as the premises of a given sentence as a conclusion’ (1972: 10). Dascal and Margalit (1974: 115) construe this to mean that texts presenting invalid arguments must by that very fact be incoherent — a direct violation of obvious intuitions. Many important texts do contain invalid arguments, often deliberately so (e.g. Swift’s A Modest Proposal), and yet are perfectly coherent; and in far more cases, the formal validity of arguments is neither decidable nor relevant to textual communication. Therefore, van Dijk’s semantics of coherence ‘suggested’ here is empirically unworkable.9

1.18. The treatment of pragmatics is similarly unsatisfactory:

The input here are sentences (or discourses [!]) as specified in the syntax plus their semantic interpretation as given in the semantics. Such discourses are OBJECTS and as such cannot be called successful or non-successful. A first task of pragmatic theory, therefore, is to turn these objects into acts. In other words: what has been the abstract structure of the utterance-object must become the abstract structure of the utterance-act. It would be nice if the structure of the former could somehow be maintained in the structure of the latter, just as rules of semantic interpretation respect the categories of syntactic structure. The operation turning discourse into acts might also be called a PRAGMATIC INTERPRETATION of utterances. A second task of pragmatics would then be to “place” these acts in a situation (van Dijk, 1977a: 190).10

1.19. Once again, we have a host of empirically unworkable assumptions. Discourses are actions, not objects (not even sentences qualify as objects, cf. Morgan, 1975); and we need no ‘pragmatics’ to ‘turn them into’ what they already are by nature. Discourses are not specified by the syntax, but by their situation of occurrence in communicative interaction; we need no ‘pragmatic interpretation’ to ‘place’ them where they are intrinsically located to begin with. The ‘success’ of a discourse is not a matter of formal conditions, but of operational utilization in regard to human goals. All these empirically necessary and reasonable insights are denied here merely because ‘it would be nice if the structure’ of discourses should turn out to ‘maintain’ ‘the structure’ of axiomatic, logical objects. But this kind of direct correspondence cannot be found in any representative group of sample discourses: we can at best apply some suitably operationalised form of logic to reconstruct the functionality of discourses (cf. 3.8).

1.20. We have now looked at two kinds of response which linguists might make with regard to the introduction of text and discourse into their science: outright rejection vs. small extensions or modifications of prevailing theory. I conjectured in 1.3 that the encroachment of texts follows from the pressure upon linguistic theory to define its application and empirical content: all actually available samples occur in text and discourse.11 I have argued that the standard theory makes an unduly strong assumption which in effect collapses the domain of application together with the theory itself. However desirable and unifying this assumption may be for building a grammar, it has not and cannot garner sufficient empirical content. For that reason, texts cannot be treated by a linguistics of sentence syntax which simply multiplies the same fixed structural identities further into semantics and pragmatics, or into steadily longer sequences of sentences.12 I shall undertake to propose an alternative in sections 2 and 3 by first considering the status of linguistic theories in terms of the philosophy of science, and then presenting an operational logic grammar as a paradigmatic example.

2. Scientific metatheory and theory design

2.1. ‘Observable data’ are not usually such that any normal, healthy human being could notice and identify them; observing data usually presupposes extensive specialized experience in handling theories. An experimental researcher cannot be ‘a scientifically unsophisticated person into whose experimental data theoretical considerations do not enter’ (Stegmüller, 1976: 25). Yet large portions of scientific enquiry imply what Stegmüller (1976: 23ff.) describes as the ‘dual level theory of scientific language’, with separation made clearly between some sort of ‘observational language’ (as a ‘sense data language’ whose statements are ‘absolutely certain’ and ‘decidable by observation’) and a ‘theoretical language’ (defined either negatively as being non-observational or else vaguely as ‘serving a particular purpose’ ‘within the framework of a theory’). Both ‘levels’ of this ‘scientific language’ are troublesome: Putnam (1962) notes that the concept of ‘theoretical terms’ has not been clearly defined. Carnap’s above-mentioned proposal to imitate the logician making a formal language (cf. 1.8) will fail because the classification of symbols must be done before a theory can be formulated; hence, such problems arise as (a) a concept can then no longer be introduced as ‘relative to a theory’; (b) ‘various theories can be formulated in one and the same language’; and (c) ‘one and the same term can be theoretical with respect to one of these theories, but non-theoretical with respect to some other’ (Stegmüller, 1976: 28). However, ‘observational language’ is impossible to separate from some (acknowledged or unacknowledged) theory (cf. Feyerabend, 1960: 71). Hempel (1971) offered to replace the notion of ‘observational language’ with the more neutral one of an ‘antecedently available vocabulary’.

2.2. Stegmüller (1976: 36) argues that the relationship of theory and observation can be resolved only by abandoning the conventional ‘statement view’ of a scientific theory, in which ‘empirical scientific theories are classes of statements, some of which can be established as true or false only by empirical means’; and ‘the logical relations between the statements of an [...]’. The apparent straightforwardness and rigor of this view mask the grave problems of verification/falsification, logical consistency, completeness, decidability, etc. Kuhn’s (1970) historical survey shows how serious such problems are in actual scientific practice. Stegmüller (1976: 19) accordingly proffers the ‘non-statement view’:

holding a theory means having a complicated conceptual apparatus (and not sentences or propositions) at hand; namely, a core which one knows has already been used successfully a few times in core expansions and which one hopes can be used in the future for still better expansions (“normal scientific belief in progress”).

The ‘core’ of a theory is said to be ‘expanded’ if the application of the theory is extended via special laws or constraints to some new subdomain (1976: 114ff.). Now, empirical tests (core expansions) may fail without giving ‘proof that the core is worthless’; it follows that ‘a theory is not the sort of entity of which it can sensibly be said that it has been falsified (or verified)’ (1976: 19).

2.3. In the non-statement view, both theoretical language and observational language are redefined in regard to applications of a theory. Sneed (1971) defines ‘T-theoretical terms’ as those ‘whose value can be calculated only by recourse to a successful application of the theory T’ (Stegmüller, 1976: 15). There are three possible kinds of status which some domain of application can have:13

(a) Partial possible models are the ‘physical systems about which the theory is talking’, and statements can be made about them in everyday language (cf. Finke, 1979: 42ff.).

(b) Possible models are the systems described in a language enriched with T-theoretical terms.

(c) Models of the theory in the strict sense are only ‘those entities satisfying the basic mathematical structure of the theory’.

2.4. Stegmüller undertakes to demonstrate that Sneed’s approach offers a way out of otherwise inescapable dilemmas. The question of what constitutes a ‘theoretical term’ is immediately decidable. The theory itself is not a set of statements, but a core with many possible expansions. The set of applications for the theory need not be ‘extensionally given as a fixed domain, but as a rule only intensionally, and indeed then only via paradigmatic examples’ (1976: 19). The correlation between theory and application is not left vaguely uniform, but differentiated into three classes of ‘models’. In this fashion, even Kuhn’s conjecture about the lack of falsification in theory testing becomes rationally accountable: statements (or inferences among statements) may be disproven and still leave the theory core intact (2.2).

2.5. Sneed’s (1971) presentation was based on the science of mathematical physics. But recently, scholars have proposed that it could be deployed to re-establish linguistic theory also (e.g. Schnelle, 1976; Finke, 1979). One immediate advantage can be noted in the context of my remarks in 1.8: the short-circuiting of a statement-view theory into the application domain (natural language) is no longer insightful. Instead, the axiomatic logical grammars could be replaced by core grammars (cf. Haber, 1975). Such a grammar no longer tries to stipulate all possible sentences of a language (much less excluding all impossible ones); rather it tries to stipulate the probable grammatical patterns (and their relative probability values). Hence, the ominous quandary of non-uniform sentence judgments can be treated as non-decisive (cf. 1.10). And language intuitions would not enter (either with or without acknowledgement) directly into the theory: they could instead be provisional candidates for partial possible models, and, following successful applications, possible models as well.
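
The contrast between deciding well-formedness and stipulating probable patterns can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that relative probability values are estimated as relative frequencies in a small invented sample of pattern labels; neither the sample nor the estimation method is drawn from Haber (1975).

```python
from collections import Counter

# Invented sample of grammatical patterns, labelled by hand for illustration.
SAMPLE = ["NP V NP"] * 6 + ["NP V"] * 3 + ["V NP"] * 1


def pattern_probabilities(observed_patterns):
    """Estimate each pattern's relative probability as its relative frequency."""
    counts = Counter(observed_patterns)
    total = sum(counts.values())
    return {pattern: count / total for pattern, count in counts.items()}


def rate(pattern, probabilities, floor=0.0):
    """Return a graded value for a pattern instead of a yes/no verdict."""
    return probabilities.get(pattern, floor)
```

With this sample, `rate("NP V NP", pattern_probabilities(SAMPLE))` is higher than the value for `"V NP"`, and an unattested pattern simply receives the floor value rather than being excluded as ungrammatical.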

2.6. Again, we should not repeat the strategic mistake of assuming that language (or communication) has the same structure as the neutral set theory logic of Sneed. Instead, we should seek to design possible models based on all the insights into language that we can obtain. A set-theoretical model such as Sneed advances entails only very weak empirical claims: that a domain has constituent elements. Aside from the option of assigning some element(s) to the elements of a set (the basic concept of function in that framework), the constituency relations are left largely unspecified. One can employ the ordered set to determine linear priority and access. Or one can provide a quantitative differentiation (a probability value) to constituency by means of fuzzy set theory. However, all of these uses are essentially logical, not operational, and thereby neglect the crucial operationality of human language.
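
The three set-theoretical options just named can be put side by side in a short Python sketch. The words and the membership degrees are invented for illustration; the union and intersection shown are the standard maximum and minimum operations of fuzzy set theory, not anything specific to Sneed.

```python
# Plain set: only claims that the domain has constituent elements.
constituents = {"the", "dog", "barked"}

# Ordered set (here a tuple): adds linear priority and access by position.
ordered = ("the", "dog", "barked")

# Fuzzy sets: each element carries a degree of membership between 0 and 1.
# The membership values are invented for illustration.
nounlike = {"dog": 1.0, "barking": 0.6, "barked": 0.1}
verblike = {"dog": 0.2, "barking": 0.7, "barked": 1.0}


def fuzzy_union(a, b):
    """Standard fuzzy union: the larger of the two membership degrees."""
    return {x: max(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}


def fuzzy_intersection(a, b):
    """Standard fuzzy intersection: the smaller of the two membership degrees."""
    return {x: min(a.get(x, 0.0), b.get(x, 0.0)) for x in a.keys() | b.keys()}
```

All three structures describe what belongs to a domain, and to what degree or in what order; none of them says what an element does, which is the limitation noted above.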

2.7. A systems theory combines the basic claim of set theory (presence of constituent elements in an entity) with this claim of operationality. Hence, function is now definable as the contribution of an element to the workings of the system. This definition removes the requirement that elements be stable objects with a fixed identity: contributions to an operation may easily be altered, shifted, reset, or redistributed. This factor is extremely important for a theory of communication, precisely because linguistic elements can and do undergo functional shifts from one context to another. In logical terms, communication could be said to involve many variables, whereas constants can, given appropriate motivation, be redefined as variables (cf. Rumelhart, 1980). This factor suggests why language theories based on static objects are too cumbersome to be workable for real communication (cf. Note 1; 1.19; Morgan, 1975).

2.8. If a system is studied and compared in several of its states, and a theory is advanced about how states evolve from one to another, we obtain a process theory. If processes are then stated as sequences of steps to be performed in real time (as some sort of ‘operating instructions’), we attain a procedural theory. If we then actually run the procedures to compare the results with the empirical domain, we arrive at a simulation theory. Each stage in this theory progression entails steadily more specific and explicit claims about the domain of application: how it is constituted and why, how it evolves, what tasks it performs in which sequence, and so forth.

2.9. Assuming for the moment that theories of cognition and communication should be developed through this progression of states, such that simulation is eventually made feasible, then a theory in the sense of Sneed would have to start off with qualitative rather than quantitative applications. In that case, ‘T-theoretical terms’ would not be (at least not at first) those ‘whose value can be calculated only by recourse to a successful application of the theory T’ (cf. 2.3); rather they would be those whose operational function(s) (in terms of systems theory) can only be explicated with regard to the theory. Application would not rest only on measurement, but also on a successful simulation of tasks or state evolutions. Possible models could be devised as those which meet the procedural demands of the theory.

2.10. These implications for the theoretical aspect can be correlated with those for the observational aspect. The ‘theory-ladenness of all observational data’ postulated as early as 1958 by Hanson (cf. 1.5) has found strong confirmation in recent cognitive psychology.14 All human apperception of the environment (including that of scientists) is heavily dependent upon theories about the environment (cf. for example, Mackworth, 1976; Neisser, 1976; Rumelhart, 1977; Havens, 1978; Allport, 1979; Johansson, 1979; Norman, 1979; Beaugrande, 1980a). The raw sensory data fed to the human organism (e.g. visual, acoustic) is often non-determinate and noisy in itself. Neisser (1976: 44) depicts ‘rich environments’ as ‘situations that support two or more obviously different perceptual cycles’. Consequently, humans treat the ‘real world’ as a ‘partial possible model’ of their theory about nature. In an approximative way, the real world becomes a ‘possible model’ when we have successfully managed (computed, measured, controlled) it in terms of our theory; yet much of this success must be judged functionally rather than structurally. Discourse about the real world cannot therefore be a ‘sense data language’ whose statements are ‘absolutely certain’ and ‘decidable by observation’ (cf. 2.1), and the same point must therefore be extended to ‘scientific language’. Here, Hempel’s (1971) proposal to displace ‘observational language’ with an ‘antecedently available vocabulary’ (2.1) is cognitively justified: ‘the “fixed” and “temporally invariant” observational language may be replaced by that pragmatic-historically relativised part of a scientific language whose descriptive symbols are antecedently available terms’ (Stegmüller 1976: 26). This solution can in turn be generalized to referential use of language at large.

2.11. It now seems clear that theories of human cognition and communication must embody ‘general design principles, which appear to be important in the effective utilization and integration of large amounts of intrinsically uncertain knowledge’ (Allport, 1979: 61). We are witnessing a new generation of linguistic theories whose essence is the application of human knowledge about typical contexts and situations (e.g. Schank et al., 1975; Schank and Abelson, 1977). Even the language material itself (sounds, word groups, word definitions, etc.) must be utilized in such a way that the evidence in one subsystem can be applied to constrain hypotheses about what is going on in another subsystem (e.g. syntax/semantics/pragmatics) (cf. Woods et al., 1976; Bobrow, 1978; Walker et al., 1978), a principle developed by Woods and Brachman (1978) into the notion of hypothesis-merging. This approach is far more workable operationally than the conventional logical notion of disambiguation: computing all possible readings and then reducing them down to the ‘correct’ one. Without hypothesis-merging, disambiguation is combinatorially explosive, creating an exponential increase of alternative pathways far beyond processing capacities. Successful processing depends vitally upon anticipating occurrences and zeroing in on the crucial cues of the language material. Apparently, linguistic messages can be reasonably well understood without consulting every word (De Jong, 1977; Schank, Lebowitz, and Birnbaum, 1978; Masson, 1979).
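
The operational difference between computing all readings first and constraining hypotheses as evidence arrives can be shown in a toy Python sketch. It is not the mechanism of Woods and Brachman (1978); the word readings and the table of admissible adjacent categories are invented, and the only constraint modelled is compatibility between neighbouring words.

```python
from itertools import product

# Hypothetical candidate readings for the five words of
# "time flies like an arrow" (category labels invented for illustration).
READINGS = [
    ["noun", "verb"],   # time
    ["noun", "verb"],   # flies
    ["verb", "prep"],   # like
    ["det"],            # an
    ["noun"],           # arrow
]

# Assumed table of admissible neighbouring categories.
ALLOWED = {
    ("noun", "verb"), ("noun", "noun"), ("verb", "prep"),
    ("verb", "det"), ("prep", "det"), ("det", "noun"),
}


def compatible(left, right):
    """True if two adjacent readings may stand next to each other."""
    return (left, right) in ALLOWED


def exhaustive(readings):
    """Compute every complete reading, then discard the inconsistent ones."""
    return [seq for seq in product(*readings)
            if all(compatible(a, b) for a, b in zip(seq, seq[1:]))]


def incremental(readings):
    """Extend partial hypotheses word by word, dropping a hypothesis as
    soon as the next word contradicts it."""
    frontier = [(r,) for r in readings[0]]
    for options in readings[1:]:
        frontier = [hyp + (r,) for hyp in frontier for r in options
                    if compatible(hyp[-1], r)]
    return frontier


print(exhaustive(READINGS))
print(incremental(READINGS))
```

Both functions return the same surviving readings, but the exhaustive version must first build the full product of alternatives, which grows exponentially with sentence length, whereas the incremental version never carries forward a hypothesis that has already been contradicted.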

2.12. In light of such findings, how can a linguistic theory confront the issues of verification/falsification, application, and empirical content? The answer lies, I think, in the interaction between functional diversification and functional consensus. Human experience places us under pressure to find some means of account and orientation for everyday situations; hence, our theories about the real world must be diversified enough to encompass a reasonable range of variety and complexity. Diversification is heightened in domains where society as a whole must agree in principle about the classification of experience: for instance, the legal system demands general agreement about cause/effect, intention/accident, the limits of the physically possible, and so forth. In this manner, the prevailing human theory of the real world can be called upon to encompass the collective experiences of the entire society. Agreement can only be reliably maintained by a functional consensus, that is, by a workable correlation between different people’s versions of what is going on and what should be done about it. Whenever human theories about reality are found to be dysfunctional (failing to provide needed account and orientation), they will be tuned, that is, modified in such a way that satisfactory application becomes possible. Steady usage of a theory should encourage continual tuning, both by the individual and by society in general. Tuning becomes necessary on a large scale when technological innovations (telescope, microscope, radar, etc.) significantly widen the domain that requires account.

2.13. However, technological means may bring an accrual of ‘facts’ which the average human can no longer encompass in a general theory. The usual reaction is a fragmentation of theories, where orientation and account are provided for one small subdomain at a time and no clear unified picture of the whole is attempted. Scientific specialization often buys its functional consensus at the price of theory fragmentation. In linguistics, the theory of ‘autonomous syntax’ is an egregious illustration.

2.14. Still, the trend toward functional diversification in the sciences is now unmistakably evinced by the rise of interdisciplinary research, including the attempts to found ‘super-sciences’ like cognitive science, communication science, semiotics, and cybernetics. Researchers have come to realize that fragmented theories are likely to be transcended and discarded at a greater rate than otherwise, and that appeals to ‘direct observation’ in one tiny domain cannot withstand the recent results in both philosophy of science and psychology of cognition. The new computers offer the potential for simulating theories of a degree of complexity never before attainable; along the way, the theories must be rendered entirely explicit (though not static), and the criteria for successful application are given in the workability/unworkability of the programmed theory.15

2.15. It is, I surmise, absolutely essential for the diversification of scientific theories that the ‘statement view’ be dropped in favour of the ‘non-statement view’ developed by Kuhn, Sneed, and Stegmüller. Theory fragmentation has been due in part to the attempt to uphold the completeness, consistency, and mechanical decidability of scientific statements about an extremely narrow domain. Diversification is achievable through continual ‘core expansion’ (2.2), and logical rigour can be phased in gradually through the progression from partial possible models to possible models to strict models (2.3).

2.16. This approach will still not eliminate the original threat of circularity or ‘autoverification’ (1.5) between theory and domain. But each time application is made to a new domain, the points where theories enter and the means whereby this happens are structurally and functionally altered. Hence, artefactual observations (those dictated wholly by the theory) are not likely to be transferable from one domain to the next. The functionality of the theory is effectively retested in each new domain. Along precisely these lines, William Estes (1979: 47), a memory psychologist, remarks that ‘the measure of success in moving toward scientific explanation is the degree to which a theory brings out relationships between otherwise distinct and independent clusters of phenomena’.

2.17. The proliferation of mini-theories in linguistics (cf. 1.6) might suggest that language is somehow believed to encompass non-uniform domains. Phonology, syntax, semantics, and pragmatics would then each require a separate theory. After all, attempts to generalize the phonological theory of ‘binary minimal features’, or the syntactic theory of well-formedness, to other language domains have not been markedly successful. Nonetheless, there are four lines of refutation against the argument of non-uniformity and theory fragmentation:

(a) The ‘macro-logical concept of reduction’ makes it possible to reduce one theory to another ‘even when the theories are formulated in different conceptual languages’ (Stegmüller, 1976: 12). This task requires setting up a ‘reduction relation’ between a ‘reduced’ theory and a ‘reducing theory’ in order to ‘transpose the basic concepts’ of the former into those of the latter, and to ‘map’ the ‘basic laws’ in the same direction; it must be possible to deduce the basic laws of the reduced theory from those of the reducing theory (Stegmüller, 1976: 128). Thus, rigid-body mechanics can be reduced to particle mechanics (Adams, 1959).

(b) All the domains of language form a functional unity whenever language is used. Thus, the differences among domains are obviously offset by some kind of control factors which enable constant interaction and correlation. We could envision a sufficiently high-powered theory (cf. 2.21) where the various language domains are elements of a complex system, such that their material identity is offset by their contributions to the overall system’s operations (cf. 2.7).

(c) It has by no means been shown that the domains of linguistics are not uniform — the most we can say is that current theories make them appear that way. It might well emerge that the subdomains of a language can profitably be seen as control domains under the care of procedural specialists (cf. Brown and Burton, 1975; Kuipers, 1975), such as ‘syntax specialists’, ‘semantic specialists’, and so on (see Winograd, 1972). However, the basic organization of these specialists might be fairly similar in all domains.

(d) The failure to generalize theories of phonology and syntax might well be attributable to the inherently dysfunctional nature of the theories themselves. For example, the phonological approach with minimal features in binary oppositions gives no clues as to how sounds are differentiated or utilized in real-time language operations; I have tried to set down what might be needed to remedy that lack (Beaugrande, 1978). Similarly, a derivational approach to syntax leaves us without a clear means for representing real-time activities of sentence utilization; for example, how is it that people can begin analysis of a sentence before having encountered all of its major categories?

2.18. My conclusion is that a linguistic theory for a science of texts must be treated as a component theory for a more subsuming, integrated theory of human cognition and communication (Beaugrande 1997). Linguistic theory must then be designed such that the results and theories of other domains (e.g. learning, memory, cognition, planning, motivation, social interaction) can be interfaced at the appropriate points. This trend is plainly evident in the newly emerging field of cognitive science (cf. Beaugrande, 1980a).

2.19. A consequence of this demand is that theory design leaps into prominence as a major determiner of scientific discovery and progress. The conventional methods of the past called for experimentation to prove or disprove hypotheses drawn from a theory: the design of the theory itself was taken as given rather than negotiable, because verification/falsification was construed as a yes/no forced choice, rather than a subtle gradation. Consequently, testing was mostly quantitative (accuracy or time in task performance) rather than qualitative (critique of theory design). And theory fragmentation was not recognized as a disadvantage - if anything, it assists quantitative testing by simplifying the design.

2.20. Perhaps there might be some ‘normative’ standard for a research program comparable to that envisioned by Imre Lakatos (1970), but with a different emphasis. We could expound design values to suggest which sort of designs ought to be given preference as a matter of principle. We might include values such as:

(a) A design is preferable if it encompasses more domains than another design (diversification);

(b) a design is preferable if it brings out the similarities in seemingly diverse domains (consensus);

(c) a design is preferable if it requires fewer theoretical concepts than another design (economy);

(d) a design is preferable if it enables a progression from sets and systems to processes, procedures, and simulations (dynamics) (cf. 2.7ff.);

(e) a design is preferable if it allows tasks to be carried out with less expenditure than another design (efficiency);

(f) a design is preferable if its simulation program is more rapid, compact, and general than another design (computability);

(g) a design should make it possible to compare and contrast it with previous, alternative designs (commensurability);

(h) a design should be extendable from the laboratory or computer to socially real situations (ecological validity) (cf. 1.6);

(i) a design should not violate common intuitions without good reason (plausibility); and

(j) designs should enable empirical research not just of a quantitative nature but also of a qualitative nature such that the theory can be steadily tuned (increasing approximation) (cf. 2.12).

2.21. Values are not useful unless we also have criteria that yield a profile for any theory we want to consider; we must correlate normative outlooks with descriptive ones. I shall now suggest some criteria which I have so far found useful for constructing profiles of reading theories (Beaugrande, 1981):

(a) Scale ranges from local (e.g. single sounds, words, etc.) to global (e.g., the ‘gist’ of a whole text or discourse).

(b) Power describes the capacity to apply general typologies of entities or operations to a wide range of occurrences (cf. Minsky and Papert, 1974: 59).

(c) Extent of utilization ranges from total (every sound, word, etc.) to extremely fuzzy (only certain strategic elements).

(d) Decomposition deals with whether (and how far) elements are broken down into smaller elements (e.g. word meanings being decomposed into minimal ‘semantic features’ or the like).

(e) Automatization concerns the extent to which operations can be carried out without conscious attention (cf. LaBerge, 1973).

(f) Serial versus parallel processing is decided by providing for either sequential or simultaneous operations, respectively.

(g) Modularity versus interaction depends on whether a theory foresees independent or interfaced components, respectively.

(h) Logical versus operational adequacy hinges on one’s demands that the theory conform to some type of logic or to the requirements for performing actions in a reasonable manner.

(i) Processor contributions subsume whatever the human being is believed to contribute from prior knowledge or experience to a situation or action under study. In fully top-down theories, everything rests on such contributions; in fully bottom-up theories, nothing does.

(j) Processing depth designates the type of task which humans are asked to perform on materials, resulting in particular degrees of effort, involvement, and permanence (cf. Craik and Lockhart, 1972; Mistler-Lachman, 1974; Craik and Jacoby, 1979).

(k) Learning refers to the ways in which the processor properly adapts itself and its strategies during the execution of operations.

(l) Freedom is the range of variations among different human beings that is still allowed to be subsumable under a single theory-concept (e.g. all the versions that would still be acceptable as ‘understandings’ of a given text).

(m) Openness versus closedness refers to whether a theory is amenable to the admission of further factors beyond those it has chosen to address (i.e., whether limitations are imposed upon the ‘expansions’ of its ‘core’).

(n) Storage capacities encompass the means whereby the processor creates a mental representation of some presented materials and uses it later on for various purposes (e.g. to make a report or summary). In trace abstraction, storage is assumed to contain only traces of the presentation (Gomulicki, 1956). In construction, storage also includes materials contributed by the processor at the time of presentation (Bransford, Barclay, and Franks, 1972; Ortony and Anderson, 1975). In reconstruction, the storage itself continues to evolve, so that when some usage is made, the processor addresses material only in its current state of evolution (cf. Spiro, 1977).

(o) Typology of presentations is the category for specifying a theory according to the materials presented in some context (e.g. a text-linguistic theory that is specified for various text types).

(p) Programming status concerns the extent to which a theory has been written into a computer program (cf. Newell and Simon, 1972; Anderson and Bower, 1973).

2.22. These criteria were used, for instance, to obtain comparative profiles of ten reading theories, as shown in a full table in my report (Beaugrande, 1981, also posted on this website). In some cases, the researchers in question had not taken an explicit stand on certain design criteria. Yet the number of cases where a criterion was in fact not applicable was relatively small. More often, the design of the theory already embodied an unacknowledged decision. And in many more cases, the researcher would agree that certain design features are desirable and eventually necessary, but had not decided how to incorporate them. These findings (which I obtained both by studying the respective literature and by interviewing the researchers in person) suggest that far too little attention has been paid to theory design in the past, and that an explicit discussion would be extremely helpful for both the theorist and the experimenter.

2.23. The debates in the philosophy and meta-theory of science might also be recast in terms of design. The question of ‘theoretical versus observational language’ (cf. 2.1) no longer appears as a dichotomy between separate epistemological worlds, but rather as a set of decisions about design procedures. These decisions should be made explicitly, such that the design values proposed in 2.20 can be openly discussed, and such that theory tuning can be undertaken with the greatest precision. Qualitative testing would be much simpler than in the past because each design criterion represents a range for negotiation: findings might well have implications for criteria that were not expressly being tested at the moment. We could attain more obvious ‘scientific progress’ because theory design would be clearly traceable through its evolutions. And progress would be more cumulative because design features could be preserved even when a quite different sort of theory arose.

2.24. Kuhn (1970) feels inclined to deny the possibility of cumulative linear progress in science, based on numerous historical examples. Yet he may be emphasizing empirical content too heavily over theory design. The cases where ‘the professional community’ of scientists is seemingly ‘transported to another planet’ via a new theory (Kuhn, 1970: 111) are, I think, far more scarce than those in which a new theory attracts scientists by virtue of its design improvements over the old theory. This factor might explain why scientists will prefer the new one even though, in its initial stages, the new one faces at least as many problems as the old and has fewer successful applications to its credit.16 This kind of design continuity offsets the ‘incommensurability’ of theory content that Kuhn felt impelled to stress so heavily. Indeed, if there were no design continuity at all, most scientists would be unable to recognize the new theory as a theory in any sense comparable to the one they had used hitherto; there could be no new experiments, no substantive debates, and no full awareness of what is at stake.

2.25. Consider, for instance, the most (in)famous ‘scientific revolution’ in linguistic theory, popularly called (by its adherents) the ‘Chomskyan revolution’ (even before Kuhn’s work became widely known). No linguist could deny that there was a major ‘paradigm shift’ and a radical change in methodology (Beaugrande 1998). Yet many innovations concerned empirical content, that is, theoretical assumptions that did not derive immediately from the design of the grammar; indeed many of these assumptions (for instance, the ‘competence/performance’ distinction and the account of ‘language acquisition’) were not made until the 1965 version, whereas the design was essentially delineated in the 1957 version. In contrast, many design features of the new theory were already familiar to linguists:

(a) the sentence as the largest unit possessing grammatical structure was already identified by Bloomfield (1933);

(b) the notions of ‘competence’, ‘grammar’, ‘homogeneous speech community’, and other fundamental frameworks used to justify the design were anticipated in the work of the neogrammarians, especially Hermann Paul (1880);17

(c) the notion of ‘transformation’ had been expressly introduced for distributional analysis by Harris (1952);

(d) ‘autonomous syntax’ harkens back to the descriptive structuralist conception of ‘separation of linguistic levels’ (e.g. Bloch, 1948; Trager and Smith, 1951);

(e) the grammatical categories (‘noun phrase’, ‘verb phrase’, etc.) were largely taken over intact from traditional grammars; and

(f) the abstract ‘phrase structures’ could easily be reduced to the ‘slot-and-filler’ analysis, say, of tagmemics (e.g. Pike, 1954).

2.26. And most important, as I remarked before (1.7), the specific design was already fully available in axiomatic logic. For these motives, this ‘revolution’ was essentially a radical change of content superimposed on a background of design continuity, i.e. a reinterpretation of antecedently available designs. The same point must be made for subsequent sentence theories: the ‘extended standard theory’, ‘generative semantics’, ‘relational grammar’, and so forth. Even the current ‘revolution’ toward ‘procedural grammars’ preserves some design features, although with major changes in design organization as needed to become operational. This trend will now be pursued in section 3.

3. Linguistics and logic in a science of texts

3.1. Like van Dijk, I am convinced that the text is a proper object of inquiry for linguistic theory; indeed, it is the only such object which can endow that theory with solid empirical content (cf. 1.3). My own arguments (Beaugrande, 1980a; Beaugrande and Dressler, 1981) include the three categories which van Dijk adduces in his (1972) dissertation and which Dascal and Margalit (1974) attacked: (a) methodological, (b) grammatical, and (c) psychological. I briefly reviewed the debate over these arguments in 1.14ff and concluded that Dascal and Margalit’s refutations of some of them can be considered successful only if one accepts the standard theory’s stipulation of linguistic science: to ‘explain our ability to produce and understand a virtually infinite number of sentences of unlimited length’ (Dascal and Margalit, 1974: 105). Empirically, this stipulation is a palpable myth: no human being ever encounters an ‘infinite number of sentences’ in a lifetime, and no sentence is ever of ‘unlimited length’. It follows that a theory of sentences must, among other things, distinguish between abstractly possible sentences and empirically probable ones; only the latter can be discovered by analysing samples or obtaining informants’ judgments (cf. 1.10). Rather than a complete grammar, we should develop a core grammar containing the sentence types (and their formation rules) that speakers of a language routinely utilize (cf. 2.5).

3.2. The methodological arguments advanced by van Dijk centre around his conception of ‘naturalness’ (1.14). ‘A theory T1 is more natural than another T2 if T2 can be reduced to T1 (in the sense of 2.17) without loss of descriptive adequacy, whereas T1 cannot be reduced to T2’. For example, a sentence grammar can be reduced to a discourse grammar, but not vice-versa (van Dijk, 1972: 22). Dascal and Margalit (1974: 85) combat this preference value on the grounds of ‘a high price in terms of the complexity of its purely theoretical part’. In fact, however, the discourse grammar outlined by van Dijk can reduce complexity by admitting whole new classes of interactive constraints. Consider the famous ‘Waltz effect’ discovered in computer recognition of lines and vertices (Waltz, 1972): whereas the mathematically possible number of interpretations for some vertices runs into billions, the interactive physical constraints upon the vertices belonging to a single consistent object reduce these unmanageable numbers down to mere hundreds (for some vertices, just 10 or 20!). The same effect is now emerging in linguistic theories: the staggering numbers of possible combinations in a context-free grammar become manageable when the constraints of an interactive context are incorporated into a theory. Complexity would explode only if each language domain (syntax, semantics, pragmatics) were reconstructed by a complete, autonomous rule system in isolation from all the other domains.
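
The reduction achieved by interactive constraints can be sketched in Python as a simple filtering loop in the spirit of Waltz’s procedure: a candidate label is deleted whenever no neighbouring element offers a consistent partner for it, and deletion is repeated until nothing changes. The element names, the numeric labels, and the agreement constraint below are invented for illustration and do not reproduce Waltz’s (1972) catalogue of vertex labellings.

```python
def waltz_filter(candidates, neighbours, consistent):
    """Delete every label that lacks a consistent partner at some
    neighbouring element; repeat until no further deletion is possible."""
    domains = {node: set(labels) for node, labels in candidates.items()}
    changed = True
    while changed:
        changed = False
        for node, other in neighbours:
            # Check the constraint in both directions across each link.
            for a, b in ((node, other), (other, node)):
                keep = {x for x in domains[a]
                        if any(consistent(a, x, b, y) for y in domains[b])}
                if keep != domains[a]:
                    domains[a] = keep
                    changed = True
    return domains


# Hypothetical example: three linked elements whose labels must agree.
CANDIDATES = {"A": {1, 2, 3}, "B": {2, 3}, "C": {3}}
NEIGHBOURS = [("A", "B"), ("B", "C")]


def agree(a, x, b, y):
    """Assumed constraint: linked elements must carry the same label."""
    return x == y


print(waltz_filter(CANDIDATES, NEIGHBOURS, agree))
```

Taken in isolation, the three elements admit six label combinations; after filtering, each element retains a single label. The constraint at one end of the chain propagates to the other without any enumeration of complete combinations.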

3.3. It should be added that the standard theory receives a poor rating in terms of the design values set down in 2.20. It fails to provide for diversification, dynamics, computability, ecological validity, and plausibility. Its showing in terms of consensus, economy, efficiency, and increasing approximation has not been impressive. With regard to design criteria (2.21), the standard theory is local in scale, low in power, modular rather than interactive, and inflexible toward learning and freedom. It sacrifices operational adequacy to logical adequacy, for instance, by insisting on total utilization and decomposition. It is not amenable to incorporating considerations of processing depth, storage capacities, and typology of presentations. These values and criteria are not ‘dubious methodological concepts’ (Dascal and Margalit, 1974: 85): they are formatting decisions which any theory will have to face.

3.4. The grammatical arguments advanced by van Dijk include definite/indefinite articles, pronominalisation, and presuppositions. Dascal and Margalit airily hoped that all these issues can be treated in an adequate sentence grammar, and van Dijk almost seemed to agree at one point (cf. 1.14). But I suspect that a far more general and more plausible account of such issues is obtainable in a general text theory where cognitive processes are consulted. Definiteness is then a category for text elements whose status has been clarified in any of several ways: previous mention, constraining definition, uniqueness, default status, prototypicalness, superlativeness, and so on (Beaugrande, 1980a: 137-44); only some of these factors could be singled out by grammar alone, such as previous mention. Pronominalisation is then a means for reducing processing load by using place-holders in surface structure while maintaining the active status of cognitive content in working memory (Beaugrande, 1980a: 144-51). Presuppositions are open components of active world-knowledge and their status is determinate, typical, or accidental (cf. Beaugrande, 1980a: 71, 139ff., etc.). All these issues can therefore be treated in a sentence grammar only very awkwardly and incompletely, because the grammatical aspect is a by-product dominated by cognitive control (and not vice versa).

3.5. The importance of psychological arguments should now be plain. If van Dijk’s presentation of those arguments is open to attack, then it is only because he viewed this aspect too heavily in terms of formal semantics. The latter domain is much more amenable to treatment in a logic-based sentence grammar (e.g. ‘generative semantics’) than the former. When Dascal and Margalit (1974: 82) single out ‘the properly linguistic facts of texts’, they are in fact designating only those ‘facts’ which their own chosen theory allows them to see and identify.

3.6. I will now pursue this point by focusing upon a quite different type of logical grammar which, I think we can agree, does address ‘properly linguistic facts’ such as word categories and syntactic patterns. In contrast to the logical grammars foreseen either by van Dijk or his opponents in past years, this grammar is operational - in fact, it has already been debugged and run on the University of Texas DEC 10 computer — a legendary machine for all true ‘hackers’.18 On May 2, 1980, it also parsed on a PDP 11 mini-computer, which is not unduly remote from a good home microcomputer.

3.7. Before presenting the grammar, I must stress that it is to be construed as a neutral reconstructive tool for computing operations which were first set down and justified on cognitive and communicative grounds (Beaugrande, 1980a). I designed a text-world model as a conceptual-relational network: a directed graph whose links were assigned labels in terms of cognitive relationships (e.g. ‘affected entity of an action’, ‘location’, ‘purpose’, etc.) (cf. Beaugrande, 1980a: 82ff. for a listing). Such a text-world model qualifies as a ‘partial possible model’ for a formal theory whose design is at present by no means complete.19 The theory entails some claims, such as these: (a) that texts are linear only on the surface, whereas their underlying structures (of whatever kind) are organized in terms of multiple access; (b) that the central priority in using a language is the maintenance of continuity among constituent occurrences, both within a single subdomain and among various subdomains; (c) that successful language operations depend crucially on converging expectancies about what is probable at a given point, and about what to do if something else occurs instead; (d) that grammatical rules should be stated as working procedures for applying the rules to actual tasks (e.g. building or analysing sentences and clauses in real time); (e) that meaning is an ordered set of hypotheses about processing actions over cognitive content in some current context; and so on. As a suitable formalism, multiple access can best be represented in a network format, as opposed to a linear formula. And the test for theoretical hypotheses is best found in task performance.
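
A conceptual-relational network of this general kind can be sketched in Python as a directed graph with labelled links. The class name, its methods, and the sample nodes are hypothetical; only the three link labels are taken from the examples cited above, and the sketch makes no claim to reproduce the model in Beaugrande (1980a).

```python
class TextWorld:
    """A directed graph whose links carry cognitive relation labels."""

    def __init__(self):
        self.links = {}  # source node -> list of (label, target node)

    def link(self, source, label, target):
        """Add a labelled link from source to target."""
        self.links.setdefault(source, []).append((label, target))

    def related(self, source, label):
        """Nodes joined to source by a link with the given label."""
        return [t for (l, t) in self.links.get(source, []) if l == label]

    def reachable(self, start):
        """Every node accessible from start by following links; access
        depends on the network, not on surface word order."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for _, target in self.links.get(node, []):
                if target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen


# Invented sample content.
world = TextWorld()
world.link("launch", "affected entity", "rocket")
world.link("rocket", "location", "desert")
world.link("launch", "purpose", "test")

print(world.related("launch", "purpose"))
print(sorted(world.reachable("launch")))
```

Because each node is addressed through its links, the same content can be entered from several points, which is the sense of ‘multiple access’ that a linear formula cannot express directly.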

3.8. In this framework, the logical grammar presented below is a partial candidate theory: its successful application in computer simulation allows us to define the semantic network as a possible model of the candidate theory, if we accept the arguments presented in 2.9. However, we must be careful to bear in mind that a logical reconstructive theory should not be misconstrued as an epistemological commitment. A more subsuming theory will have to regulate the correlations between the logical and the epistemological in terms of an elaborate mediation (cf. Petöfi, 1979).

3.9. The grammar comprises a system of clausal logic axioms of the kind developed in Robert Kowalski’s (1974) research, revised as a book in 1979. This system is embodied in a programming language called PROLOG (Roussel, 1975; Warren and Pereira, 1977). The language is in many ways similar to LISP but is founded on a subset of classical logic rather than on Church’s lambda calculus. There exist both PROLOG and LISP versions of the ‘rocket’ grammar, the latter version programmed in Daniel Chester’s HCPRVR (‘Horn Clause theorem PRoVeR’) (Chester, 1980). I follow mostly the LISP notation here, since it is easier to read and more generally known.20

3.10. Conventions of the grammar include implicit conjunction and disjunction, and implicit universal quantification over all variables. A clause may be an assertion, a hypothesis, or a consequent-antecedent pair. The consequent is true only if the antecedents are true. A hypothesis is proven true if it matches a true assertion or else the consequent of a consequent-antecedent pair whose antecedents can be proven true. In the computational implementation, the set of clauses constitutes a procedure; the consequent is the calling form and the antecedents are the body of the procedure. The hypothesis to be proven is the initial call to the procedure. Since the clauses express relations, one argument can be used to return the result of the procedure’s evaluation.

3.11. The following definitions are set up, according to Simmons, Chester, and Correira, whose results I follow throughout (cf. Simmons and Chester, 1979; Simmons and Correira, 1979; Chester, 1980; Simmons, 1980). A clause is an expression:

(1) C1...Cm < A1...An, where m, n are equal to or greater than zero.

Here, all Ci and Ai are atomic formulas: the Ci are called consequents and the Ai are called antecedents. The left-pointing arrow < designates that the consequents are found to its left and the antecedents to its right. A clause can include variables (denoted by ‘L’ through ‘Z’) and be interpreted:

(2) For all L through Z, C1 or C2...or Cm is true if A1 and...and An are true.

A clause with no antecedents is an assertion:

(3) C1...Cm < where m is greater than zero

and would be read as:

(4) For all L through Z, C1 or...or Cm.

A clause with no consequents is a hypothesis:

(5) < A1...An where n is greater than zero

and would be read as:

 (6) For no L through Z, A1 and...and An.

With regard to resolution conventions, the symbol <, where m and n both equal zero, is the null clause.
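The four cases distinguished above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the class name `Clause` and the method `kind` are hypothetical and are not taken from HCPRVR or PROLOG, and atomic formulas are simply written as tuples of strings.

```python
from dataclasses import dataclass
from typing import Tuple

# An atomic formula is sketched as a tuple of strings, e.g. ("MEMBER", "X", "Y").
Atom = Tuple[str, ...]


@dataclass(frozen=True)
class Clause:
    """A clause C1...Cm < A1...An (hypothetical helper for illustration)."""
    consequents: Tuple[Atom, ...] = ()
    antecedents: Tuple[Atom, ...] = ()

    def kind(self) -> str:
        m, n = len(self.consequents), len(self.antecedents)
        if m == 0 and n == 0:
            return "null clause"        # the bare symbol <
        if n == 0:
            return "assertion"          # C1...Cm <
        if m == 0:
            return "hypothesis"         # < A1...An
        return "consequent-antecedent pair"


fact = Clause(consequents=(("MEMBER", "X", "Y"),))
query = Clause(antecedents=(("MEMBER", "X", "Y"),))
print(fact.kind(), "/", query.kind(), "/", Clause().kind())
```

The classification depends only on the counts m and n, which is all that definitions (1) through (6) require.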

3.12. An atomic formula is an expression of the type:

(7) (P t1...tn)

where P is an n-ary predicate symbol and the ti are terms (i.e., the atom P names the relation holding among the terms ti). A term is a variable, a constant, a list of terms, or an n-ary function such as

(8) (F t1...tn)

where the ti are terms. While the symbols ‘L’ through ‘Z’ (with or without subscripts) are reserved as variables, all other atomic symbols (which, depending on the location in clauses, may be predicate or function names) are interpreted as constants (cf. Kowalski, 1974).

3.13. Lists are defined according to the conventions of LISP. For example, the list of constants

(9) (A B C (D E))

is a list whose topmost elements are A, B, C, and (D E). If the complete dot notation of LISP is applied, where dots are pointers, we have:21

(10) (A.(B.(C.((D.(E.NIL)).NIL))))

Any variable, such as X, matches the entire list. (In complete dot notation, the one-element list (X) would be (X.NIL), thus matching only expressions of the form (A.NIL), ((A B).NIL), and so on.) Suppose we match a dotted pair list of variables like (X.Y) to the list (9); we obtain the values:

(11) X: = A, Y: = (B C (D E))

The dotted list (X Y Z.W) would be bound:

(12) X: = A, Y: = B, Z: = C, W: = ((D E))

The non-dotted list (X Y Z (X1 X2)) also matches and is bound as:

(13) X: = A, Y: = B, Z: = C, X1 : = D, X2: = E

The list conventions of LISP as illustrated above are useful in matching two clauses. The LISP function CAR accesses the first list element, and CDR the remainder (thus, in (X.Y), X refers to the CAR and Y to the CDR). In addition to these data selectors, constructors (CONS) are provided to concatenate two lists as a pair or dotted pair.
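The bindings in (11) through (13) can be reproduced with a short Python sketch. It is an illustration under assumptions of my own, not the HCPRVR matcher: a dotted pair is modelled as a Python tuple `(car, cdr)`, NIL as `None`, and a variable as a single letter from L through Z with an optional digit subscript. The helper names `cons`, `lst`, `is_var`, `match`, and `show` are hypothetical.

```python
import re

NIL = None  # the empty list


def cons(car, cdr):
    """Build one dotted pair (car . cdr)."""
    return (car, cdr)


def lst(*items, tail=NIL):
    """Build a list from dotted pairs; `tail` gives a dotted remainder."""
    for item in reversed(items):
        tail = cons(item, tail)
    return tail


def is_var(term):
    """Variables are L..Z, optionally followed by digits (e.g. X, X1)."""
    return isinstance(term, str) and re.fullmatch(r"[L-Z]\d*", term) is not None


def match(pattern, data, bindings=None):
    """Match a pattern with variables against variable-free data.

    Returns a dict of bindings, or None if the match fails.
    """
    bindings = {} if bindings is None else bindings
    if is_var(pattern):
        if pattern in bindings:
            return bindings if bindings[pattern] == data else None
        return {**bindings, pattern: data}
    if isinstance(pattern, tuple) and isinstance(data, tuple):
        bindings = match(pattern[0], data[0], bindings)   # compare the CARs
        if bindings is None:
            return None
        return match(pattern[1], data[1], bindings)       # then the CDRs
    return bindings if pattern == data else None


def show(term):
    """Print a term back in LISP list notation."""
    if term is NIL:
        return "NIL"
    if not isinstance(term, tuple):
        return term
    parts = []
    while isinstance(term, tuple):
        parts.append(show(term[0]))
        term = term[1]
    if term is not NIL:
        parts += [".", show(term)]
    return "(" + " ".join(parts) + ")"


data = lst("A", "B", "C", lst("D", "E"))                  # the list (9)
patterns = [cons("X", "Y"),                               # (X.Y)
            lst("X", "Y", "Z", tail="W"),                 # (X Y Z.W)
            lst("X", "Y", "Z", lst("X1", "X2"))]          # (X Y Z (X1 X2))
for pattern in patterns:
    bound = match(pattern, data)
    print(show(pattern), "->", {v: show(t) for v, t in bound.items()})
```

In this sketch the bare variable X matches the whole of list (9), whereas the one-element pattern `lst("X")` fails against it, since its CDR is NIL.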

3.14. The Horn clause can be used to avoid disjunctive consequents. The Horn clause is limited to a single atomic formula in a consequent. Kowalski (1974) thus obtains a procedure that both describes and computes whether one list is a subset of another. Consider this case: X is a subset of Y if for all Z, if Z is a member of X, Z is a member of Y. In Horn clauses, our definition of the subset would be:

(14) (SUBSET NIL NIL) <

(SUBSET NIL (Y.Y1)) <

(SUBSET X Y) < (SPLIT X Z X1) (MEMBER Z Y) (SUBSET X1 Y)

(SPLIT (X.X1) X X1) <

(MEMBER X (X.Y1)) <

(MEMBER X (Y.Y1)) < (MEMBER X Y1)

In English, this program would read: (1) NIL is a subset of NIL and every list; (2) X is a subset of Y if the list X is split into Z and X1, and Z is a member of the list Y, and X1 is a subset of Y; (3) to split a list (X.X1), return X and X1; and (4) X is a member of a list (U.Y1) if X equals U or if X is a member of the remainder, Y1.

3.15. Consider now how the above procedure might be called by typing in:

 (15) < (SUBSET (A B) (D E A B))

The left-pointing arrow < signals that this message is a hypothesis to be proven. It is tried against all assertions and consequents, and matches successfully with:

 (16) (SUBSET X Y) <...

The X matches the list (A B), while Y matches (D E A B). These values are bound through the corresponding uses of X and Y in the antecedents,

 (17) (SPLIT X Z X1) (MEMBER Z Y) (SUBSET X1 Y).

Therefore, the first antecedent is bound as (SPLIT (A B) Z X1). This antecedent is now entered as a hypothesis to be proven and is matched to:

(18) (SPLIT (X.X1) X X1) <.

The unification yields:

(19) (SPLIT (A B) A (B)) <

whose values are returned and bound through (MEMBER Z ...). The Z in the antecedent SPLIT was bound to the constant A, so that (MEMBER A (D E A B)) is in fact the new binding of the next antecedent. MEMBER operates as a recursive test of whether the first argument matches the top of the second or is a member of the latter’s remainder. If A is indeed a member of (D E A B), the recursive call to SUBSET becomes completely bound:

(20) (SUBSET (B) (D E A B))

and is taken as a new problem. When B is likewise found to be a member of (D E A B), the final recursive call is

(21) (SUBSET NIL (D E A B))

which matches the following to end the entire procedure:

(22) (SUBSET NIL (Y.Y1)) <
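The trace just described can be replayed with a small backward-chaining prover written in Python. This is a minimal sketch under assumptions of my own and not a reconstruction of HCPRVR: formulas and dotted pairs are both modelled as Python tuples, NIL as `None`, variables as a letter from L through Z with an optional digit subscript, and each use of a clause renames its variables with a numeric suffix. The names `unify`, `prove`, `proves`, and `resolve` are hypothetical, and no occurs check is performed.

```python
import itertools
import re

NIL = None  # the empty list


def lst(*items, tail=NIL):
    """Build a LISP-style list out of dotted pairs (car, cdr)."""
    for item in reversed(items):
        tail = (item, tail)
    return tail


def is_var(t):
    # L..Z, optional digits, optional "_n" suffix added when a clause is renamed.
    return isinstance(t, str) and re.fullmatch(r"[L-Z]\d*(_\d+)?", t) is not None


def walk(t, env):
    """Follow variable bindings until an unbound variable or a non-variable."""
    while is_var(t) and t in env:
        t = env[t]
    return t


def unify(a, b, env):
    """Return an extended environment unifying a and b, or None on failure."""
    a, b = walk(a, env), walk(b, env)
    if a == b:
        return env
    if is_var(a):
        return {**env, a: b}
    if is_var(b):
        return {**env, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            env = unify(x, y, env)
            if env is None:
                return None
        return env
    return None


def rename(t, n):
    """Give the variables of a program clause a fresh suffix for this use."""
    if is_var(t):
        return f"{t}_{n}"
    if isinstance(t, tuple):
        return tuple(rename(x, n) for x in t)
    return t


def resolve(t, env):
    """Substitute all known bindings into a term."""
    t = walk(t, env)
    return tuple(resolve(x, env) for x in t) if isinstance(t, tuple) else t


# Program (14): each entry is (consequent, [antecedents]).
PROGRAM = [
    (("SUBSET", NIL, NIL), []),
    (("SUBSET", NIL, ("Y", "Y1")), []),
    (("SUBSET", "X", "Y"),
     [("SPLIT", "X", "Z", "X1"), ("MEMBER", "Z", "Y"), ("SUBSET", "X1", "Y")]),
    (("SPLIT", ("X", "X1"), "X", "X1"), []),
    (("MEMBER", "X", ("X", "Y1")), []),
    (("MEMBER", "X", ("Y", "Y1")), [("MEMBER", "X", "Y1")]),
]

fresh = itertools.count()


def prove(goals, env):
    """Backward chaining: yield each environment in which all goals hold."""
    if not goals:
        yield env
        return
    first, rest = goals[0], goals[1:]
    for head, body in PROGRAM:
        n = next(fresh)
        new_env = unify(first, rename(head, n), env)
        if new_env is not None:
            yield from prove([rename(g, n) for g in body] + rest, new_env)


def proves(hypothesis):
    return next(prove([hypothesis], {}), None) is not None


print(proves(("SUBSET", lst("A", "B"), lst("D", "E", "A", "B"))))
```

The clauses are tried in the order in which they are listed, so the sequence of calls follows (15) through (22): the third SUBSET clause is selected first, SPLIT then binds Z and X1, MEMBER recurses down the second list, and the proof ends on (SUBSET NIL (Y.Y1)).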

3.16. We can see that the clausal logic proposed here is a straightforward backward chaining applied with recourse to lists and variables. We can now consider its use in transforming the sentences of an English text into a semantic network, or from the network back to the sentences. The sample text reads:22

(23) A great black and yellow V-2 rocket forty-six feet long stood in a New Mexico desert. Empty it weighed five tons. For fuel it carried eight tons of alcohol and liquid oxygen.

Everything was ready. Scientists and generals withdrew to some distance and crouched behind earth mounds. Two red flares rose as a signal to fire the rocket.

With a great roar and a burst of flame the giant rocket rose slowly and then faster and faster. Behind it trailed sixty feet of yellow flame. Soon the flame looked like a yellow star.

In a few seconds, it was too high to be seen; radar tracked it as it sped upward to three thousand mph. A few minutes after it was fired the pilot of a watching plane saw it return. It plunged to earth forty miles from the starting point.

3.17. To begin, we can use only the simple sentence: ‘The giant rocket rose’. We want to represent it as a network of nodes and links, with a label for each link. Following and adapting my system of links (Beaugrande 1980a), Simmons and Chester (1979: 9) obtained the network shown in Figure 1.23

In LISP notation the network appears as a bracketed string, thus:

(24) (RISE TNS PAST AE (ROCKET NBR SING SIZE GIANT DET THE))
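One way to read the bracketed string (24) is as a head node followed by alternating link labels and values, where a value may itself be a bracketed node. The following Python sketch makes that reading explicit. The alternating label-value reading is my assumption about the notation, and the function name `to_node` and the dictionary keys `head` and `links` are hypothetical.

```python
# The network (24), written as a nested Python list.
NETWORK = ["RISE", "TNS", "PAST",
           "AE", ["ROCKET", "NBR", "SING", "SIZE", "GIANT", "DET", "THE"]]


def to_node(expr):
    """Read (HEAD LABEL VALUE LABEL VALUE ...) as a node with labelled links.

    Assumes that every label is followed by exactly one value.
    """
    if not isinstance(expr, list):
        return expr                      # a plain value such as PAST or GIANT
    head, rest = expr[0], expr[1:]
    links = {rest[i]: to_node(rest[i + 1]) for i in range(0, len(rest), 2)}
    return {"head": head, "links": links}


node = to_node(NETWORK)
print(node["head"], "--AE-->", node["links"]["AE"]["head"])
```

On this reading, RISE is linked to PAST by the label TNS and to the ROCKET node by the label AE (‘affected entity’), while ROCKET carries its own links for number, size, and determiner.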

A logic grammar which can transform the sentence into the network has assertions for dictionary entries, and consequent-antecedent clauses for what is expressed in conventional linguistics as phrase structure rules plus transformational rules. Thus, our lexical assertions for the sample are:

(25) (ART THE) <

(ADJ GIANT) <

(NOUN ROCKET (ROCKET NBR SING)) <

(VV ROSE (RISE TNS PAST)) <

where the left-pointing arrow < marks each one as an assertion (cf. 3.11).24

3.18. The syntactic structure of the sentence would appear in a conventional linguistic tree as shown in Figure 2.

The lexical assertions in (25) can serve as rules for the four terminals of the tree. Yet we also need rules to capture the semantic relationships between any pair of constituents. Simmons (1978: 458) proposes a four-tuple rule form for sentence parsing:

(26) (NAME, SYNTACTIC-COMPONENT, SEMANTIC-TEST, TRANSFORMATION)

Taking the Horn-clause notation:

(27) (S X (V2 W V1)) < (NP X V1 R) (VP R V2 NIL) (ARCNAME V1 V2 W)