In Ad Hermans (ed.), Les dictionnaires specialisés et l’analyse de la valeur. Louvain-la-Neuve: Peeters, 1997, 57-74.


Text Linguistics, Discourse Analysis, and the Discourse of Dictionaries


Robert de Beaugrande



A. Dictionaries as Discourse?

The title of this paper may strike an unaccustomed note, in the sense that a dictionary is not typically conceived of as discourse or as a discourse type. Instead, it is widely regarded as a type of list or listing whose organisational principles differ substantially from discourse in the everyday sense. The discrepancy recedes, however, if we define discourse not as an artefact of language based on the model of everyday conversation, but as any communicative event among participants (see Beaugrande in preparation). We then shift our focus from the dictionary as a tangible artefact of paper and ink to the compilation and use of dictionaries as communicative occasions occurring under characteristic circumstances.

In view of the social, intellectual and linguistic importance of dictionaries, it might be considered curious that they have received so little attention in their own right. The modest statements of principle and method occurring at the beginning of most commercially published dictionaries are seldom read by the users, as if there were nothing in any way complicated or problematic in the achievement which the dictionary represents.

No doubt this widespread disinterest reflects the folk wisdom that words are relatively reliable and stable units whose meanings can readily be captured in an orderly list of definitions. The same folk wisdom would hold that discourses are simply combinations or series of such units whose significance in turn derives from a corresponding combination of their meanings as represented by those definitions.

Among the leading discoveries upon which a whole series of intellectual and scientific trends have converged since the middle of the 20th century has been the acknowledgement that the relationships of words to meanings, and of texts to significances, are considerably more elaborate and problematic than this folk-wisdom account could remotely suggest. These trends include the cognitive revolution spanning cognitive psychology, linguistics, psycholinguistics, cognitive linguistics, and critical linguistics, along with social psychology, rhetorical psychology, and most of the more significant trends in literary theory, including Marxism, feminism, post-structuralism, and deconstruction. Despite their extraordinary diversity, all of these trends concur that conventional accounts of meaning, ranging from the philosophies of antiquity up to medieval scholasticism and finally into the increasingly formalised semantics of our own century, have woefully underestimated the inadequacy of stable correspondences between words and meanings to account for the phenomena of human communication and interaction. It is now generally agreed that meanings are not conveyed from person to person by words and sentences the way that building blocks might be passed along from hand to hand (or from mouth to ear), but that meanings are actively and jointly constructed, negotiated, and adjusted during the actual communicative event.

On the face of it, such assertions might appear quite ominous for the enterprise of dictionary-making. Inevitably, dictionary-makers are required to abstract, by whatever methods, across large classes of events and attempt to determine what aspects or elements of a meaning are sufficiently common or shared as to merit inclusion in a dictionary definition. Dictionary-makers can either proceed as usual and give no attention to the intellectual trends I have cited above and to any others of a similar import; or they can rethink the processes of constructing and using dictionaries in the light of these recent developments. In this paper, I shall adopt the latter course, using as my frame of reference the domains generally known as text linguistics and discourse analysis, naturally as determined by my own views as I have attempted to set them down in a large-scale study still in progress at this time (Beaugrande 1997).

B. From Conventional Semantics to Discourse Analysis

Historically, discourse analysis grew out of several disciplines, chiefly linguistics on the one side and anthropology plus sociology on the other. The two sides differed, not surprisingly, in the extent of the focus they placed upon language as opposed to the other factors involved. On the side of linguistics, whose preoccupation with language went to the extreme of attempting to abstract it out of the context of ordinary communication, discourse analysis was primarily prepared by the methods of fieldwork linguistics. The situation of doing fieldwork naturally keeps the fieldworker in a continual engagement with the cognitive and social production of meanings, even when the official goal of the enterprise is still a description of, say, ‘morphology’ or ‘grammar’. This continual engagement enables fieldworkers to draw powerful and well-supported conclusions about the meanings of words or utterances even when the methodology and theoretical framework for drawing such conclusions have not yet been supplied. For want of a better term, we can call this data-driven semantics, in the sense that hypotheses and conclusions about meanings are continually being generated and tested. Should these be inaccurate, the fieldworker will soon encounter difficulties in his or her attempts to participate in the discourse of the community, such as being misunderstood, or giving rise to unintended merriment.

On the other side, we have what can justly be termed theory-driven semantics, which works from the top down by postulating an essentially artificial framework, such as formal logic, purported to supply the wherewithal for definitions of meaning. Apart from the obvious differences in procedure, such as the construction of elaborated schemes of "semantic features", the most important contrast between this method and the ones supported by fieldwork is that hypotheses and conclusions are not subjected to any similar social and cognitive testing. Typically, the only source of opposition or correction comes from other theory-driven semanticists, who may or may not be in sympathy with the theoretical framework but who all share the tendency to argue on largely intuitive grounds, based on their personal assessments of what words or sentences might mean apart from how they have been observed in realistic situational contexts. As far as I can discover, most of the conventional semantics in the "mainstream linguistics" that has dominated the agendas of professional journals, conferences, and departments has been of this latter type, as indicated for instance by such general surveys as Lyons (1977).

The outcome has been widespread stagnation in which semantics has failed to progress beyond a handful of stock examples, many of them artificial constructions which would be quite unlikely to occur in ordinary communication. In some of these discussions, as I have shown (Beaugrande 1984), the question of dictionary definitions does surface, but usually not as a central issue on the agenda of semantics. Instead, semantics has usually presented itself as an enterprise of a much more theoretical and less practical nature than the enterprise of producing dictionaries. This prospect no doubt reflects the generally top-heavy and theory-driven character of mainstream linguistics, but it has created an unfortunate deficit in the potential interactions between semantics and lexicography, as the discipline of dictionary production has come to be called, in pointed opposition to lexicology, the study of word meanings as such.

From my own perspective, a particularly troublesome issue in this regard has been the disinterest of conventional semantics in the meanings of actual utterances in realistic communication. Typically, such meanings are treated, if at all, only as rather vague or indeterminate approximations of the formal semantics ideally represented with full precision and determinacy, most characteristically mirrored in the schemes of semantic features. When a well-known linguistics journal attempted to compile a special issue on ‘text-semantics’, the only result was the publication of my own contribution in a regular issue; the editor informed me that no other publishable results had been submitted. Without having any further information about the other potential or actual contributors, I would surmise that the most important obstacle in this as in many other areas was the widespread notion that a semantics of texts can be based on a semantics of sentences or even a semantics of words in an essentially combinatory fashion, under the leading principle that the meaning of a text is simply the sum of the meanings of its sentences or words. This view has proven remarkably persistent despite important research showing the inadequacy of accounts that treat the whole as the sum of its parts.

Suppose for the sake of argument that we adopted a sample text of limited length and relatively straightforward character, that is, without any obvious problems or ambiguities of the type that would be obvious candidates to defeat precise analysis. And let us suppose also that by dint of considerable hard work we had compiled an exhaustive analysis of the meanings of all of its words into some set of semantic features. What predictions could safely be made about the results? Two seem readily evident. First, our description would be enormously unwieldy, containing large numbers of features that would either be redundant with respect to other features in the same description, such as the well-known semantic feature "+ animate" that would need to be respecified each time an animate agent is selected, say as the subject of a verb; or else be disturbingly arbitrary in having no particular generality, of which Fodor and Katz's "without a mate at breeding time" readily springs to mind as part of the meaning of "bachelor" when applied to a seal as opposed to a human being. We would thus have not merely a vast clump of trees with no forest, but a clump with some rather abnormal trees that do not grow anywhere out in nature.

The second prediction would be that this profusion of data would still be missing some important aspects of the meaning of the text, such as its organisation into "themes" or "topics" — i.e., those semantic control centres which are so important in determining which meanings are the relevant ones in frequent cases when words or collocations appear which might have several definitions in an ordinary dictionary.

We should call to mind here how often conventional semantics has been argued on the basis of isolated sentences which, from a discourse standpoint, constitute artificial disturbances in communication, such as "the bill is large", to quote another notorious example from Fodor and Katz (1963). That conventional semantics should have a pronounced concern for the elimination of ambiguities is understandable in view of the mainstream linguistic notion of describing ‘language by itself'; but I have yet to see a convincing demonstration that the heavy mechanics introduced for disambiguating artificial sentences of this type are a genuine model for the comprehension of meanings in realistic discourse, where people rarely intend to create ambiguities, except possibly for humorous effects. The relation of semantic analysis to human operations of understanding has typically either been left a moot point, or else glossed over with some optimistic handwaving by claiming that people really do understand meanings in the same ways as semantics analyses them, for example by the so-called ‘relevance theory’.

One of the major motivations for both text linguistics and discourse analysis — more cautiously for the former movement than for the latter — has been the burgeoning acknowledgement that a science of text or discourse cannot make much significant headway on the assumption that the whole is the sum of its parts, whatever these are taken to be by a particular model. In current parlance, a balance has been sought between the essentially "bottom-up" mode of description that isolates and describes individual units and a "top-down" method that postulates overarching organisational patterns such as the "macro-structures" of Kintsch and van Dijk (1978). Not surprisingly, the respective contributions of the bottom-up and the top-down perspectives to the actual production and comprehension of discourse have remained an issue of lively dispute, with the linguists typically coming in heavily on the bottom-up side and the cognitive psychologists and artificial intelligence researchers more on the top-down side. Along the way, a number of rather surprising findings have fostered some abrupt shifts. Without going into the more technical details, I will sum up just three of these findings which appear to me particularly relevant to both the theory and the practice of lexicography.

The first of these comes from cognitive psychology, namely from unexpected but robust experimental findings on ‘priming’ in text reception during reading. A probe item such as a word is held to be primed if its degree of activation in memory is raised above the inactive state, as if standing at attention and waiting to be called. Primed items will be consistently recognised and responded to more rapidly than others, e.g., by pressing a key to signal that it either is or is not an English word (a ‘lexical decision task’). Surprisingly, the experiments indicated that when a word is recognised, all its meanings are initially activated, not just the relevant one. Yet after a short time the non-relevant ones are ‘deactivated’, while the relevant ones raise their ‘activation’ and ‘spread’ it to further associates. Suppose you are a speaker of American English reading a text on a moving computer display containing this passage:

[1] The townspeople were amazed to find that all the buildings had collapsed except the mint.

The text suddenly halts at ‘mint’, and the display gives you a ‘target item’ to decide whether it is a real word. For a brief interval of up to roughly half a second, your response would probably show priming for both the relevant ‘money’ and the non-relevant ‘candy’, but not for the inferable ‘earthquake’ (what made the ‘buildings collapse’). Thereafter, the non-relevant item would lose its activation while the relevant and the inferable items would gain. Evidently, the constraints of co-text and context exert their control during this tiny interval, in a series of cycles using excitation and inhibition to regulate the strength with which any one word or meaning is associated with the rest, e.g. through a shared topic.
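The dynamics just described, initial non-selective activation followed by contextual excitation and inhibition over a few cycles, can be caricatured in a small sketch. Everything in it is invented for illustration: the activation values, the support scores, and the settling rate are not taken from the experiments.

```python
# Toy sketch of non-selective activation followed by contextual selection.
# All activation values, support scores, and rates are invented.

def settle(senses, context_support, cycles=5, rate=0.5):
    """senses: dict mapping each sense to its activation (all start equal).
    context_support: dict mapping each sense to +1 (context supports it)
    or -1 (context inhibits it). Returns settled activations in [0, 1]."""
    act = dict(senses)
    for _ in range(cycles):
        for sense in act:
            # Excitation raises supported senses; inhibition decays the rest.
            act[sense] += rate * context_support[sense] * act[sense]
            act[sense] = max(0.0, min(1.0, act[sense]))
    return act

# 'mint' in the earthquake passage: both senses equally primed at first...
initial = {"money": 0.5, "candy": 0.5}
support = {"money": +1, "candy": -1}
final = settle(initial, support)
assert final["money"] > final["candy"]  # ...but context soon sorts them out
```

Under these toy assumptions, ‘money’ settles at full activation while ‘candy’ decays toward zero, mimicking the deactivation of non-relevant senses after the brief priming interval.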

This finding projects an image of the brain resembling a dictionary, in the sense that, whether a concept is activated or a word is looked up, all of the available meanings become equally accessible, and that in both cases the relevant meaning rapidly becomes the centre of attention. But there is also a fundamental difference, namely that most dictionaries follow methodical principles for listing meanings in a particular order, whereas non-selective activation of meanings indicates a principled absence of preferential ordering. However, if we pass from the activation of individual items over to the active network of context, which from a psychological standpoint is obviously the operational base that does the rapid sorting of meanings, then there could very well be a contextual reshuffling of preferences, rather like a dictionary whose definitions are continually being reordered according to which ones would be considered more important or central in a discourse domain, such as the discourse of geometry. Though such a dictionary would be impossible in paper copy, the prospects for an electronic version functioning in such a modality are no longer unduly remote, as we shall see later on.
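An electronic dictionary of the sort just envisaged might be sketched as follows. The entry, its senses, and the domain weights are all hypothetical; the point is only the mechanism of reordering definitions by their centrality in a given discourse domain.

```python
# Hypothetical sketch: the senses of 'plane', each weighted (by invented
# numbers) for how central it is in two discourse domains.
senses = {
    "a flat two-dimensional surface":  {"geometry": 0.9, "travel": 0.1},
    "a powered fixed-wing aircraft":   {"geometry": 0.1, "travel": 0.9},
    "a tool for smoothing wood":       {"geometry": 0.2, "travel": 0.0},
}

def order_for(domain):
    """Return the definitions sorted by their weight in the given domain."""
    return sorted(senses, key=lambda s: senses[s].get(domain, 0.0), reverse=True)

# In the discourse of geometry, the geometric sense is listed first:
assert order_for("geometry")[0] == "a flat two-dimensional surface"
assert order_for("travel")[0] == "a powered fixed-wing aircraft"
```

The same entry thus presents its definitions in a different order for each discourse domain, which no fixed paper listing could do.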

The second finding comes from the field of artificial intelligence, which includes language and language understanding among its broad spectrum of interests. Unlike linguistic and semantic theories, a model in artificial intelligence must pass the rigorous test of functioning in an operational setting, such as accepting a story as input and then answering questions about it or making a summary. A number of the complicated formal systems proposed by linguists signally failed this operational test. However, the question of how far such a program can be said to understand the meaning of the text it processes has remained irresolvably controversial. A number of early natural language programs simply finessed the issue by using superficial rules to rearrange words into strings that resembled meaningful utterances but which depended on the comprehension processes of the program's user rather than of the program itself.

A radical departure from the usual methods came from an approach known as parallel distributed processing (or ‘PDP’). Instead of manipulating words and phrases or their meanings as whole units, this approach uses network representations whose meaning is determined by the relative strength of links among nodes. This method has been termed "subsymbolic" in the sense that it operates with units that are in principle not meanings but elements which are organised to constitute meanings as they are needed. At least one model of this type has been constructed and successfully tested for applying a scheme of semantic cases, namely the ‘role assignment’ model of McClelland and Kawamoto (1986). For a sample like

[2] The bat broke the window

the ‘case frame’ involves a competition between ‘bat’ as Instrument (for playing cricket) and ‘bat’ as Agent (flying mammal) (1986: 305). Context is searched for evidence to strengthen the trace of some features and to weaken others, raising the probability of converging on the correct meaning.
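The competition can be caricatured in a few lines, though it must be stressed that this is not McClelland and Kawamoto's model: their network learns distributed subsymbolic weights, whereas the hand-built feature table and evidence scores below are invented solely to illustrate how contextual evidence can tip the balance between readings.

```python
# Drastically simplified sketch of role competition for [2] 'The bat broke
# the window'. The feature sets and evidence weights are invented.

def assign_role(readings, evidence):
    """Score each reading by summing the evidence its features receive,
    and return the reading with the highest score."""
    scores = {name: sum(evidence.get(f, 0.0) for f in feats)
              for name, feats in readings.items()}
    return max(scores, key=scores.get)

readings = {
    "bat-as-Agent (flying mammal)":    {"animate", "small", "flies"},
    "bat-as-Instrument (for cricket)": {"inanimate", "rigid", "hand-held"},
}

# Context supplies evidence: 'broke the window' weakly suggests animacy
# but strongly suggests a rigid, hand-held instrument.
evidence = {"rigid": 0.6, "hand-held": 0.4, "animate": 0.2}
assert assign_role(readings, evidence) == "bat-as-Instrument (for cricket)"
```

Changing the evidence (say, a preceding sentence about a cave) would raise the ‘animate’ score and let the Agent reading win instead, which is the sense in which context regulates the outcome.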

We might see some interesting parallels between this work and that cited above for cognitive psychology. In both cases, an active network is postulated to be responsible for the on-line organisation and construction of meaning. Thus, more complex meanings need not be stored but can be adapted and adjusted to contextual requirements — clearly a desirable feature for a model of discourse processing and an attractive framework for reinterpreting typical semantic feature schemes in a more dynamic and realistic mode.

On the other hand, these PDP models do not bear any particularly suggestive resemblance to dictionaries or dictionary uses. On the face of it, it seems hard to imagine what a "subsymbolic" approach to lexicography might look like. Of course, lexicographers are well aware that meanings can very well be regarded as parts or components of other meanings, although it may be unclear how far this vision of smaller parts and larger parts is to be understood metaphorically rather than substantively – an issue of more theoretical than practical significance.

The third development comes from the domain of large corpus linguistics, which has emerged from the availability of very large computer-sorted corpuses of authentic discourse data. Within linguistics, its most obvious ancestors are fieldwork and discourse analysis, for which it provides valuable support in enabling more rapid and convenient surveys of larger ranges of actual data than were possible with those two earlier approaches. In contrast to the first two developments I have cited, this one is directly relevant to the concerns of lexicography and is indeed already well on its way to profoundly transforming it. The success of the Collins COBUILD English Language Dictionary is an obvious signal, although the ways in which it differs from previous approaches to lexicography are less obvious (cf. Sinclair 1992).

One major implication is that the staid division into ‘levels’ apart from the lexicon is no longer strategic. For example, the verb ‘brook’ is one of the many lexical items found to have a clearly defined set of constraints on its usage (Sinclair 1994). Grammatically, it takes negation (usually ‘not’ just before it or ‘no’ just after), and seldom appears with the first or second person singular as the subject. Semantically, its direct object is a concept associated with opposition, interference, or delay. Pragmatically, the subject must be in a position of sufficient authority to carry the performative perlocutionary force entailed in declaring what it will not ‘brook’.
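Constraints of this kind can be verified mechanically once a corpus is available. The following sketch uses three invented concordance lines, not COBUILD data, to check the grammatical constraint that ‘brook’ occurs under negation:

```python
# Minimal concordance check on the verb 'brook'. The example lines are
# invented for illustration; a real study would scan a large corpus.
import re

lines = [
    "The director would brook no interference from the board.",
    "She will not brook any delay in the negotiations.",
    "The court does not brook such opposition lightly.",
]

# A line counts as negated if 'not' precedes 'brook' or 'no' follows it.
negated = [l for l in lines
           if re.search(r"\bnot\b[^.]*\bbrook\b|\bbrook\s+no\b", l)]
assert len(negated) == len(lines)   # every use occurs under negation
```

The semantic constraint (objects like ‘interference’, ‘delay’, ‘opposition’) could be checked the same way, by tabulating what follows the verb across the concordance lines.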

Such data suggest that the various ‘levels’ of linguistic description can be exploited more richly and directly for lexicology and lexicography. An elegant solution also impends for the perennial problem of deciding which usages of particular lexical items or collocations of items deserve to count as representative and hence worthy of inclusion in dictionary definitions. Researchers have repeatedly discovered that the intuitions even of native speakers may prove unreliable in the face of actual data. In compensation, scanning a set of representative data reliably leads to a consensus much more readily than any of the proposed analyses of meanings into ‘semantic features’ or similar schemes, where the lack of consensus has been a significant obstacle to progress even for seemingly simple cases.

No doubt, this third development, the emergence of large corpus linguistics, has done more than any other to put lexicography back in the spotlight of language study after it had been obliged to lead a somewhat shadowy existence along the margins of conventional linguistics and even of conventional semantics. Today, lexicography as a practical concern is increasingly driving theoretical deliberations on topics which had long been considered closed, such as the relative independence of descriptions in morphology, syntax, and semantics. Most significantly, semantics may finally be released from the relatively windless enclosure into which it had been introduced by formalist schemes for analysing meaning independently of discourse.

C. A ‘new lexicography’?

Could a reassessment of dictionaries as a type of discourse point to a ‘new lexicography'? It seems evident that dictionaries compiled from large corpuses of discourse would differ significantly from those compiled by the usual traditional and intuitive methods. The latter always remain at a certain unspecified distance from utterance data, so that the accuracy of the definitions entails a margin of uncontrollable variance among lexicographers. Of necessity, these lexicographers were all significantly influenced by the practice-driven exigencies of their work rather than by abstract speculations about the nature of particular meanings. Rather than saying that they were simply not ‘theoretically oriented’ at all, it might be more to the point to say that they relied chiefly on implicit theories specified by their practice but not explicitly prescribed by authoritative formal research. Just as functional linguists and ethnomethodologists contend that the discursive practices of participants in ordinary conversation represent some ‘underlying theory’ about how language and discourse operate, so too would the practices of lexicography, albeit in a substantially more disciplined manner.

As long as ‘mainstream’ linguistics was intent upon circumscribing its boundaries on a plane sufficiently abstract to be removed from what it regarded as the grainy and largely adventitious details of ordinary discourse, the relationship between lexicography and linguistics naturally remained unsettled, and the deliberations by linguists on the nature of the "lexicon" typically proceeded on quite different assumptions and principles than those which were applied to the production of dictionaries. To put none too fine a point upon the matter, the major difference was simply that the linguists were under no pressure to produce tangible results, but could argue endlessly over finicky problems and handfuls of artificial examples.

Today, the shoe appears to be on the other foot, with lexicography moving toward the centre of activity and interest, thanks to the significant advances enabled by large corpuses. In the process, the conventional linguistic and semantic concept of the "lexicon" is undergoing energetic reassessment: no longer as a system of abstract rules for arranging features in matrices in order to generate and describe potential meanings, but as a range of strategies for constructing, adjusting, and negotiating meanings in the context of human sense-making activities during discourse.

How might we characterise the ‘new’ dictionary as a type of discourse? The obvious alphabetical procedure of listing renders it a necessarily episodic discourse type, that is, a discourse with a multitude of brief episodes whose relation to adjacent episodes is far less systematic than would be the case in most other types of discourse. In some cases, a single episode would cover no more than one item plus one definition; in others it would cover a set of mutually related definitions. Typically, the cross-references from definition to definition are only coincidentally related to the linear ordering of the definitions. Exceptions would be largely etymological, where one lexical base has yielded several items having related meanings and also standing proximate in alphabetical sequence, e.g. ‘inherit’, ‘inheritable’, ‘inheritance’, ‘inheritance tax’.

Less obvious is the way that every definition cross-refers to other items in the dictionary, namely those which are used to formulate the definition itself. When a dictionary is put into computer-readable format and indexed as a concordance, as was done at the computer science department of the University of Texas at Austin in the 1970s, it became possible to compare the relative frequency with which certain items figured in the definitions of others, with many pages of listings for items like ‘thing’.
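The Texas-style frequency count is easy to reproduce in miniature. The three-entry ‘dictionary’ below is invented, but it shows how indexing the definitions as a concordance exposes the heavy load carried by items like ‘thing’:

```python
# Counting which items figure most often in the definitions of others.
# The mini-dictionary is invented for illustration.
from collections import Counter

definitions = {
    "chair": "a thing with four legs used for sitting",
    "table": "a thing with a flat top used for eating or writing",
    "stool": "a thing like a chair but with no back",
}

# Tally every word token across all definition texts.
counts = Counter(word for text in definitions.values()
                 for word in text.split())
assert counts["thing"] == 3   # 'thing' figures in every definition
```

Scaled up to a full dictionary, the same tally yields the many pages of listings mentioned above, and thereby an empirical picture of the dictionary's own defining vocabulary.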

The syntax or grammar of dictionary definitions remained for a long time relatively specialised with respect to other discourse types. In each definition, the head item would be followed by some notation indicating its pronunciation and some grammatical information such as is supplied by a parts-of-speech scheme. Thereafter, the definition would be formulated in fragmentary phrasings rather than complete sentences, many times following conventions quite specific to the dictionary. A passage like ‘funicular: of, relating to, or being a funiculus’ could come from no other discourse type than a dictionary (Webster's Seventh New College Dictionary, p. 339).

As lexicography was modernised, the conventional solution has been to divide the task of writing definitions among people whose discourse practice has been gained in various specialisations, so that their assessments should be more authoritative and representative than otherwise. But specialists are not necessarily considerate of non-specialists and may write definitions that require ordinary users to page about collecting definitions of the defining terms, e.g. when ‘cosecant’ is defined as ‘the secant of the complement, or the reciprocal of the sine, of a given angle or arc’ (Webster's Random House College Dictionary, p. 307).

The most significant innovation in English lexicography is once again the COBUILD series, which offers all definitions as complete sentences, typically in a conditional form, e.g. (Collins COBUILD English Language Dictionary, p. 815):

[3] If you say ‘you've got to laugh’, you are saying that you can see the amusing side of a difficult situation rather than being sad or angry about it.

[4] If you say someone ‘is laughing all the way to the bank’, you mean that they are making a lot of money very easily and feel very confident.

The motive for this innovation was to formulate definitions that are expressly similar to ones ordinary people might give during a conversation, say to a child or a foreigner who is still in the process of learning the language. This innovation brings dictionary definitions much closer to ordinary discourse, which is also the source of the corpus data, than the earlier definitions illustrated a moment ago.

What of the semantics of dictionary discourse? More prominently than conventional linguistic theories suggest, semantics is omnipresent: special care is expended upon selecting items that are semantically relevant for each individual episode. The episodic quality naturally impedes the formation of topics, themes, or macrostructures. In return, each episode is constructed on the hypothesis that for any given item a definition can be formulated such that its total significance will be equivalent to the total significance of that item in at least one of its representative uses. This hypothesis rests ultimately on the conception of synonymy, which has long been regarded as one of the most stable and central conceptions in semantics. One term can be defined by its synonym, e.g.: ‘eerie: strange, mysterious; syn [synonym] see weird’ (Webster's Seventh New College Dictionary, p. 263).

Yet large corpus data show that actual synonymy is quite rare in the sense that virtually no two lexical items collocate in precisely the same way. Complementarily, the large corpuses show that synonymy might be strategically replaced by a concept such as ‘mutual collocability': the potential of formulating collocations (in the definition) which suggest sequences that can collocate in corresponding contexts, e.g.:

[4a] they are laughing all the way to the bank

[4b] they are making a lot of money very easily and feel very confident

Admittedly, collocations like [4a] and [4b] often differ in frequency and probability: the collocation to be defined is more stable, while the collocation of the definition has been made to order.

The new type of definition thus has two parts: (1) a statement of the meaning of an item and (2) some directions for collocating the item; the second part was missing from such conventional definitions as the one for ‘cosecant’ cited above. The traditional demand of some semantic theorists that a synonym supports a definition only if it could stand in the place of the lexical item in a sentence (e.g. Wiggins 1971) is no longer relevant here, now that large corpus data have suggested that any neutral substitutability in context is, at best, a partial one. For example, ‘serious concern’ collocates pejoratively, whereas ‘serious consideration’ collocates amelioratively, even though we seem to have the same word ‘serious’ and two words with similar definitions, ‘concern’ being ‘marked interest or regard’ and ‘consideration’ being ‘continuous and careful thought’ (Webster's Collegiate Dictionary, pp. 172, 178), and ‘showing concern’ for somebody resembling ‘being considerate’ of them. But we actually have here two meanings of ‘serious’: one in such collocations as ‘serious problem’, i.e. ‘grave’, and the other in such collocations as ‘serious intention’, i.e. ‘sincere’.

By the same token, antonymy, another central concept of conventional semantics, no longer holds any particularly privileged place, but is reassigned a role as an aspect of explanatory discursivity, taking advantage of the ‘dialectical’ convention of defining things in terms of what they are not. For corpus-driven dictionaries, this convention has only a limited range. For example, [5a], though logically precise, is a poor definition because it assumes you know the meaning of the base item (‘tidy’), which would make it unnecessary to look up ‘untidy’; [5b], though logically imprecise, is much better because it refers to items whose meanings a user might well know:

[5a] something that is untidy is not tidy

[5b] something that is untidy is messy and disordered and not neat or well-arranged (Collins COBUILD English Language Dictionary, p. 1604)

We see here what I have long suspected is a general problem of logical semantics: logical precision must be paid for by sacrificing usefulness.

The issue of usefulness brings us to the pragmatics of the discourse of dictionaries. For conventional dictionaries like those cited in section B, usefulness was determined by rather diffuse criteria. The ambition was evidently to attract as wide a range of users as possible, from the quite general to the quite specialised. The specialised definitions were not just written by specialists but also intended for would-be specialists, e.g. users who already know what ‘secant’ and ‘sine’ mean. Since such definitions are hard on general users, the lexicographers apparently assumed that people tend to stay within their own fields — hardly the case nowadays.

The new corpus-driven dictionaries put the problem in a more realistic light:

many words of technical origin in current use have highly specific meanings which are not really accessible to anyone who does not know the subject. They are explained, so to speak, within a scientific or humanistic discipline. If we just wrote out the ‘official explanation’, our users would hardly be helped at all. (Collins COBUILD English Language Dictionary, p. xix)

Three approaches are adopted to resolve the problem. First, many specialized terms, such as ‘cosecant’ and ‘secant’, are too infrequent to make the cut-off that qualifies an item or collocation for inclusion. Second, having a corpus as the source ensures that ‘the meanings given are the meanings that are actually used in our ordinary texts and not necessarily what a specialist would say’ (ibid.). Third, the ‘technical words’ that are included ‘are explained’ ‘according to the way we use them in ordinary English’ (ibid., p. xx). The differences between this outlook and conventional dictionaries can be seen by comparing definitions like these:

[6a] gyroscope: a wheel or disc mounted to spin rapidly about an axis and also free to rotate about one or both of two axes perpendicular to each other and to the axis of spin so that a rotation of one of the two mutually perpendicular axes results from application of torque to the other when the wheel is spinning and so that the entire apparatus offers considerable opposition depending on the angular momentum to any torque that would change the direction of the axis of spin (Webster's Seventh New Collegiate Dictionary, p. 372)

[6b] A gyroscope is a device that contains a disc rotating on an axis that can turn freely in any direction, so that the disc maintains the same position, whatever the position or movement of the surrounding structure. (Collins COBUILD English Language Dictionary, p. 699)

[6a] was obviously written by a specialist so anxious to get all of the mechanical details of construction and operation just right that the result is barely readable. Moreover, other technical terms create further obstacles. Most people would at least have to look up ‘torque’, and would find this forbidding definition:

[7] torque: something that produces or tends to produce rotation and torsion and whose effectiveness is measured by the product of the force and the perpendicular distance from the line of action of the force to the axis of rotation (ibid., p. 934)

It is utterly unlikely that most people could plug definition [7] into [6a], which is already overloaded, in order to arrive at a reasonably clear understanding of the meaning of ‘gyroscope’. [6b], in contrast, gives us the salient point: the ability to ‘turn freely in any direction’ on an ‘axis’ paradoxically enables ‘the disc to maintain the same position’ — which [6a] had obscurely portrayed as ‘offering considerable opposition’ ‘to any torque that would change the direction’.
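The first of the three approaches above, the frequency cut-off, can be sketched in a few lines. The word counts and the threshold below are invented for illustration, but the principle is the one described: items below a minimum corpus frequency simply do not qualify for inclusion.

```python
from collections import Counter

# Illustrative only: frequencies standing in for counts from a large corpus.
freq = Counter({"wheel": 1200, "device": 950, "gyroscope": 40,
                "secant": 3, "cosecant": 1})

CUTOFF = 10  # hypothetical minimum corpus frequency for inclusion

# Keep only items frequent enough to earn an entry.
headwords = sorted(w for w, n in freq.items() if n >= CUTOFF)
print(headwords)  # → ['device', 'gyroscope', 'wheel']
```

On these invented figures, ‘secant’ and ‘cosecant’ fall below the cut-off and are excluded, while ‘gyroscope’ survives — matching the treatment the COBUILD describes.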

D. Conclusion and outlook

Simplifying somewhat, I have suggested that the development of lexicography could be described in four stages that are defined not so much chronologically as methodologically. Traditional lexicography has been almost entirely practice-driven, although close examination indicates some measure of implicit theory inherent in the practices, e.g. about the types of sources that should be considered authentic or representative. As this tradition entered its modern period, increasingly large and diversified groups of specialists were assembled for a spectrum of lexicographical domains, notably those represented by the more prestigious sciences, like physics, chemistry, and biology. A noteworthy trait of this ‘modernization’ is an increasingly stringent stance toward non-specialists, as documented by definitions of a lexical entry in terms that are sometimes equally specialized and sometimes even more so.

Another stage might be identified when ‘lexical’ issues were gradually registered as concerns for the interconnected disciplines of modern linguistics and semantics. For reasons I have suggested, conventional linguistics was generally rather reluctant to invest significant initiatives in the area of lexicography, but preferred either to concentrate on morphology and syntax or to construct a rather differently defined domain of lexicology, whose theoretical groundwork was typically remote from the practical concerns of lexicography. I have suggested also that many issues of lexicography were judged too specific to be of general theoretical interest, especially for the mainstream project of defining ‘language by itself’. An exception to this otherwise arid picture has been the register studies sponsored by functional linguistics, which raised the interesting notion of the ‘lexicon’ being ‘the most delicate grammar’ (Halliday).

Yet another stage, this one fairly recent, was inaugurated by the introduction of large computer corpuses of data as an active base for constructing dictionaries. Because lexicography was the chief motivation animating the design of such corpuses — dictionary publishers provided crucial funding in the early stages — the special needs of lexicography received much more concentrated attention than they had in the stages I described a moment ago. In return, the insights made available by corpuses have rapidly spilled over the borders of lexicography, proffering significant impulses for reassessing the design of linguistics or of some future science of language with a more transdisciplinary cast.

The final stage I would identify is only now emerging, so my remarks can best be understood as tentative proposals or anticipations of what would be desirable from the standpoint of text linguistics and discourse analysis, as well as from the standpoint of broad issues of cognition and communication. I have suggested that lexicographers should consider how to guide and support the discourse strategies of the respective user-groups as the basis for providing more detailed information about usage. Some steps have been taken in this direction by signalling which items might be considered ‘formal’ or ‘informal’, ‘current’ or ‘old-fashioned’, and so on. What remains to be done is to identify the relative degrees of specialization among different terms or, indeed, among different uses of the same term. Only then can we serve user-groups who themselves differ in their degrees of specialization. An electronic dictionary could be provided on CD-ROM that automatically reshuffles the order of definitions to suit a user in one designated special area. This process might resemble the operation of memory as recently modelled in cognitive psychology.
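Such reshuffling of definitions is easy to sketch in outline. In the following fragment, the entry, its field tags, and the wording of the senses are all invented for illustration; the point is merely that once senses carry a degree-of-specialization label, re-ranking them for a user's declared speciality is a trivial operation:

```python
# Hypothetical sketch of an electronic dictionary entry whose senses
# are re-ranked for a user's declared speciality. Field tags and
# definition texts are invented.
entry = {
    "torque": [
        {"field": "general",
         "text": "a force that causes something to rotate"},
        {"field": "engineering",
         "text": "the moment of a force about an axis, measured as force "
                 "times perpendicular distance from the axis"},
    ]
}

def senses_for(word, user_field, dictionary):
    """Return the senses of `word` with the user's field listed first."""
    senses = dictionary[word]
    # Stable sort: senses matching the user's field sort before the rest,
    # and ties keep their original order.
    return sorted(senses, key=lambda s: s["field"] != user_field)

for sense in senses_for("torque", "engineering", entry):
    print(sense["field"], "-", sense["text"])
```

For an ‘engineering’ user the specialized sense surfaces first; for anyone else the general sense leads, which is the kind of user-group sensitivity proposed above.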

Such projects would have a healthy fall-out on the discursive awareness and discursive practices of groups of specialists communicating with groups of specialists in other fields or with general audiences. Terms widely identified as highly specialized in lexicographical works would surely be handled with more deliberate care in specialized discourse than would otherwise be the case.

I would see here a significant opportunity for lexicography to make new contributions to efficient communication, well beyond the ones already enabled by conventional practices. It would no longer just be taken for granted that non-specialized users are obliged to do the work of looking up terminology; instead, terms would be contextualized and collocated in the discourse to support easy access. At that point, discourse in society would be much closer to the discourse of dictionaries such as the COBUILD.

Perhaps the prospects raised for this final stage might appear unduly sanguine, and the responsibility they apportion to lexicography a trifle daunting, not to say utopian. I can only reply that these projects would surely be useful and desirable from a practical standpoint; and that if our theoretical frameworks or ‘paradigms’ (to use the fashionable term) do not support such projects, then it is time to revise or replace them. By nature, lexicography is an eminently practical enterprise whose theoretical significance is just beginning to receive proper attention. It will have a prominent role to play during our ‘modern’ age of accelerating specialization and diversification, when conventional discourse practices are no longer adequate. Its new opportunities for a rapprochement with a large corpus of practices may well be the signal for a new golden age of lexicography.


BEAUGRANDE, R. DE. 1982. The story of grammars and the grammar of stories, in Journal of Pragmatics, 6, 383-422.

BEAUGRANDE, R. DE. 1984. Linguistics as discourse: A case study from semantics, in Word, 35, 15-57.

BEAUGRANDE, R. DE. 1997. New Foundations for a Science of Text and Discourse, Stamford, CT: Ablex.

FODOR, J. and KATZ, J. 1963. The structure of a semantic theory, in Language, 39, 170-210.

HALLIDAY, M. 1961. Categories of the theory of grammar, in Word, 17/3, 241-292.

KINTSCH, W. 1988. The role of knowledge in discourse comprehension: A construction-integration model, in Psychological Review, 95/2: 163-82.

KINTSCH, W., & VAN DIJK, T. 1978. Toward a model of text comprehension and production, in Psychological Review, 85, 363-394.

LYONS, J. 1977. Semantics, Cambridge, Cambridge University Press.

MCCLELLAND, J., & KAWAMOTO, A. 1986. Mechanisms for sentence processing, in RUMELHART, D., MCCLELLAND, J. et al. (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Cambridge, MA, MIT Press, 272-325.

SINCLAIR, J. McH. 1992. The automatic analysis of corpora, in SVARTVIK, J. (ed), Directions in Corpus Linguistics, Berlin, Mouton De Gruyter, 379-397.

SINCLAIR, J. McH. 1994. Large Corpora are Here to Stay. Lecture on corpus linguistics at the University of Vienna, June 1994. Available on video from RdB.

TRYON, Ch. 1967. Elementary College Geometry, New York, Harcourt, Brace & World.

WIGGINS, D. 1971. On sentence-sense, word-sense, and difference of word-sense, in STEINBERG, D. & JAKOBOVITS, L. (eds.), Semantics, London, Cambridge University Press, 14-34.