Rosewater, wheel of fortune: Compounding and lexicalisation in seventeenth-century scientific texts

This paper investigates the question of compounding as a productive word-formation process in Scientific English by exploring the concepts of collocation and lexicalisation. The claim is that compounds can exhibit different internal structures, including syntactically ambiguous forms, as is the case with the noun + prepositional phrase. Frequency of co-occurrence and the unique meaning of all elements, together with the phenomenon of technicalisation, argue in favour of such an assumption. Some constraints, however, must be admitted. On occasions, the semantic type of the head noun (abstract, concrete, proper, common) can determine whether a particular construction is to be classified as a compound or not.


Introduction
Modern science in the seventeenth century proposed a new type of learning based on mathematical logic, systematic experimentation, and mechanical models. From a linguistic point of view, this new method made any type of scholastic argumentation redundant, demanding, instead of dialectic resources, a new lexicon for modern technology, a set of accurate vocabulary items, and structures which served to describe with precision any type of discovery or scientific development (Hard and Jamison: 25).
Science is a sort of micro-cosmos within the Universe of knowledge. Hence, the language of science, though subject to the general contexts and characteristics of a particular language, has its own distinctive features.
The evolution of science and technology fostered the creation of new words, and sometimes even led to the introduction of new morphological patterns into the language, such as the creation of neoclassical compounds under the influence of Latin and Greek. The scientific community, represented by the members of the Royal Society, began a debate over the formal characteristics of scientific discourse which bore some similarities to the "Inkhorn Controversy" of the sixteenth century, since in both cases the discussion revolved around how to improve the lexical capacity of English. Some eighteenth-century scholars used "words or morphemes from the classical languages as building blocks in scientific terminology" (Beal, 14). Others, however, resorted to compounds.1
An expansion in the reading public in the late seventeenth and early eighteenth century, encompassing not only professional groups and the rich but also the middle classes, favoured the use of the vernacular to transmit "science", even though the style employed by writers varied considerably. In some instances it was simple and clear, as demanded by the Royal Society;2 in others, it was more literary, complex and dense.3 These two broad styles coexisted in the construction of scientific discourse during the eighteenth century, when a handful of scientific and scholarly societies were created as forums for discussion and cooperative investigation. One of the ways in which scientific English was modified was the creation of new terms. Among these formations, compounds occupied an important position.
In order to contribute to the description of scientific language at the beginning of modern science, I will explore the differences between collocations and these new compound nouns through different degrees of lexicalisation. To this end, the paper is organised as follows: in Section 1 I will discuss the concepts of collocation and compounding and the process of lexicalisation. Section 2 will present the corpus used in this study. Section 3 will concentrate on the analysis of data, focusing on types rather than on tokens, though some word counts will also be provided. Results will be presented according to the variables of etymology and discipline. Section 4 will deal with some unclear cases, and, finally, in Section 5 I will try to provide some conclusions.
1 As a matter of fact, sixteenth-century authors such as Ralph Lever in the Arte of Reason had already underlined the usefulness of resorting to compounding in English (quoted by Foster Jones, 1953: 126).
2 In accordance with Baconian stylistic patterns.
3 Like Boyle himself in some of his works.

From collocations to compounding. The concepts
Crystal (1997: 69) defines collocation as "the habitual co-occurrence of individual LEXICAL ITEMS. […] collocations are then, a SYNTAGMATIC lexical relation. They are linguistically predictable to a greater or lesser extent […]."
The breadth of this definition, which derives from Firthian linguistics and was followed by Halliday (1966) and Sinclair (1966, 1991), implies that position and frequency of occurrence are both valid criteria for the characterisation of collocations. In principle, there is no sense relation between the collocates, at least when juxtaposed forms first come into use, and their co-occurrence is not fixed. Benson et al. (1986) stated that, among the possible lexical combinations, the different degrees of semantic cohesion allow the establishment of the following taxonomy from less to more cohesive structure:

Compounds
This gradation affords a different perspective on the phenomenon, since the authors here take for granted the existence of semantic coalescence between the lexemes of a collocation.
The frequent co-occurrence of two or more elements can bring about their lexicalisation.4 The term "lexicalisation" itself can be interpreted as fusion or "univerbation of a syntactic phrase or construction into a single word" (Brinton and Closs Traugott, 2005: 48-91).
Consequently, when analysing any sort of collocational structures, different degrees of lexicalisation may be found. Compounding is the final stage: the elements forming a compound must have acquired a high degree of lexical and semantic cohesion so as to be regarded as a complex lexeme or lexical unit.
A compound is defined by Bauer (1983: 29) as "a lexeme containing two or more potential stems". This broad definition is then narrowed by the classification of compounds into four different groups according to semantic criteria, that is, the relationship in terms of meaning between the grammatical head and the preceding element. Compounding, then, means "the unification of parts which are no longer independent, there is stress shift to the first syllable and semantic motivation is lost" (Brinton and Closs Traugott, 2005: 34).
Some authors argue that lexicalisation is a process that affects "larger-than-words objects" (Hohenhaus, 2005), such as fixed expressions, idioms and clichés (Lipka, 2005: 40). However, I agree with Lipka (2005: 40) that "lexicalisation is only motivated for units of the lexicon, like simple and complex lexemes and lexical units, but that institutionalization is not restricted in this way". He goes on to say that most cases of institutionalisation (like some collocations, such as green with envy, or routine formulas, such as bottoms up) are culture-specific. In this sense, in the language of science, one may find that some constructions, after having undergone a lexicalisation process, are discipline-specific. Therefore, I could talk about a "technicalisation" process as the basis of a specialised linguistic domain or jargon, particularly when these multilexematic combinations are known and accepted by the corresponding discourse community.5

Corpus material
The corpus material used in this study was analysed with reference to three parameters: date or time-span, discipline, and lexical category. For the purpose of analysis I have selected texts from two different disciplines, Medicine and Astronomy, representing text types aimed at different kinds of reading public. The medical text, entitled A Choice Manual of Rare and Select SECRETS IN PHYSYICK AND CHYRURGERY; Collected, and Practised by the Right Honorable, the Countesse of KENT, late deceased, was written by William Shears and published in 1653.6 Astronomy is represented by samples extracted from Vincent Wing's Armonicum Coeleste: or, the Coelestial Harmonie of the Visible World, dated 1651, and A Book of Knowledge in three Parts, written in 1663 by Samuel Strangehopes. Both these works are likely to be included in the forthcoming CETA (Corpus of English Texts on Astronomy),7 which forms part of the Coruña Corpus: a collection of samples for the historical study of English Scientific Writing.8 The three works, then, were published in the second half of the seventeenth century, a period when the influence of Empiricism on the discourse of science can be said to have begun.
As can be seen in Table 1 below, a total of 36,268 words will be analysed: 20,874 belong to the medical text and 15,394 to the Astronomy texts.
Nouns are the chosen lexical category here. Sager, Dungworth and McDonald (1980), and, more recently, Nevalainen (1999), among others, have pointed out that nouns are the most relevant lexical category in scientific writing. In addition, the majority of compounds contain N+N (Quirk et al., 1985: 1567; 1570). However, not all members of the nominal class found in the texts have been considered in the analysis. Since this paper seeks to explore differences between compounds and collocations, only those structures that are (or appear to be) compounds have been taken into account. Derivatives and simple nouns, then, have been disregarded. Place names and proper nouns have also been disregarded, as well as cardinal points, nouns denoting seasons, days of the week, months and zodiac signs, except when they form part of the compound, as in:
(1) northeast wind (Strangehopes, 1663: 30)
By contrast, nominalisations of two types, -ing (grafting, moistening) and adjectival (riches, contraries), have both been taken into consideration.
6 Shears's text (1653) has been transcribed by the team compiling the Corpus of Early English Recipes (CoER). This particular sample has been transcribed by Drs Alonso Almeida and Ortega Barrera, to whom I am deeply indebted.
7 I want to thank Päivi Pahta and Irma Taavitsainen for their counselling in the compilation of CETA which, in turn, has made possible, to a certain extent, the writing of this paper.
8 The Coruña Corpus of English Scientific Writing is a current project at the University of A Coruña (Spain) by the Research Group for Multidimensional Corpus-based Studies in English (MuStE). The main interest of the group is the study of language change and variation in scientific texts. One of the subcorpora being compiled is CETA (Corpus of English Texts on Astronomy).

Analysis and results
A total of 1,013 different nouns or types were found, corresponding to 9,681 tokens, more or less equally represented across the two disciplines (ast: 2,654 + 2,010; med: 5,017). Of all those types, 121 (11.94%) correspond to compounds or compound-like structures. Although in the seventeenth century compounding does not seem to be as productive a process of word-formation as it was in previous periods, especially in Old English, it stands out as a useful mechanism for conveying scientific contents in a simple, clear and concise style, as demanded by contemporary writers of science. The idea of clarity and concision, which was on the minds of seventeenth-century authors, crystallised a century later in Margaret Bryan's9 preface to A compendious system of astronomy in a course of familiar lectures (1797):
I know that I have no claim to the public suffrage, only on account of the clearness of my illustrations, which, as well as the diagrams, are principally original. As to the phraseology, I fear it is too deficient in ornament to procure me any credit; yet I hope the clearness of elucidations may gloss over the imperfections of the stile in which they are delivered:-Had I copied that of other authors, I might perhaps have rendered these Lectures more pleasing, although less intelligible to my pupils; who, being familiar with my diction, understand my illustrations much better, as I have thence been able to deliver them more naturally and forcibly. (Margaret Bryan, Preface, viii, 1797).
The use of compounds is a way of compressing the message without losing simplicity and precision. In the analysis of compound nouns I will work with two variables, etymology and discipline, which will, hopefully, afford new insights into the use of compounding in scientific discourse.
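The word, token and type counts reported so far can be cross-checked with some simple arithmetic. The following Python lines are only an illustrative sketch: the variable names are my own, and the reading of "2,654 + 2,010" as the token counts of the two Astronomy samples is an assumption based on the parenthesis above.

```python
# Figures as reported in the text; all variable names are illustrative.
words_medicine = 20_874
words_astronomy = 15_394
total_words = words_medicine + words_astronomy  # reported as 36,268

# Assumed reading: one figure per Astronomy sample.
tokens_astronomy = 2_654 + 2_010
tokens_medicine = 5_017
total_tokens = tokens_astronomy + tokens_medicine  # reported as 9,681

noun_types = 1_013
compound_types = 121
# Share of compound or compound-like types among all noun types, in percent.
compound_share = round(100 * compound_types / noun_types, 2)

print(total_words, total_tokens, compound_share)
```

The sums agree with the totals given in the text, and the share of compound types comes out at the reported 11.94%.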

Etymology
Compounds seem to show some peculiarities regarding etymology. For this study the online version of the Oxford English Dictionary (OED) was consulted, and the ultimate origin of each term taken. All the different provenances found there were classified into three groups: Germanic, Romance and hybrid.10 The hybrid group contains those forms that combine elements of both Germanic and Romance descent.
The data suggests that compound nouns generally stem from Germanic sources, as seen in Graph 1.

Graph 1. Etymology in compounds
The tendency observed here for compounds to be created from native stock contrasts with previous studies that found Romance sources to be preferred for derivative forms (Moskowich, 2008). Some examples from the current data can be found in (2)-(4):
Germanic: (2) earthquake (Strangehopes, 1663: 47)
10 I have chosen to use these labels because of their agglutinating nature ("Romance", for instance, covers Latin and other related languages), though I am aware that much controversy surrounds the terms. My intention has been to simplify, so as to see "vernacular origin" as opposed to "others", mainly of Romance provenance. On the subject of etymologies, the OED3 is undertaking a full revision of etymological origins which, when completed, historians of the English language will have to take into account.

Discipline
According to the second variable, the number of compound types found in both disciplines is approximately the same: 60 in Astronomy texts and 61 in the Medicine text.On closer inspection, though, this numerical similarity belies significant differences, which can be found when examining the etymological origin of compounds in each discipline.

The etymology of compounds and lexicalisations in Astronomy texts
As Table 2 shows, more than half of all the compound nouns found are of Germanic origin (51.6%), followed by hybrid formations, with more than one third (36.7%) of the total compound nouns in the Astronomy texts. Instances of Romance provenance represent only 11.7% of all these nouns. The clear abundance of Germanic elements in the compounds found in the Astronomy samples is, no doubt, due to the fact that at least one of these texts could be classified as non-professional in nature. The work by Strangehopes is a good example of a text in which Astronomy and Astrology are not yet separate disciplines. In addition, this text provides a basic account of certain daily events in relation to other celestial processes. This could explain the use of common, everyday vocabulary in a "specialised text". The popularisation of knowledge was a locus communis of the intellectual climate that pervaded seventeenth-century society. The Humanist trend of the preceding century had first introduced the notion that knowledge was of value to all individuals, regardless of their social status. Moreover, Astronomy was seen as a useful and practical science in fields such as navigation (Inkster, 1992: 119). Clearly, though, not everything written was aimed at the same kind of audience. Texts could range in their degree of informativeness depending on whether they were addressed to scholars, less educated readers, or laymen.12 I would suggest that "addressee or type of audience" might well condition the lexical patterns used to convey scientific information, however specific and technical the discipline under discussion.
Wing's book, unlike that of Strangehopes, is addressed to a more specialised audience, as can be inferred from the author's own words:
The first Book containeth those necessary and immediate Elements of TRIGONOMETRY abstractly propounded, which as the foundation to the superstructure, are laid down in a due Method and compendious manner. (Wing, 1651: 1)

The Etymology of compounds and lexicalisation in the medical text
My analysis of the medical text reveals an abundance of compound types with different origins, as can be seen in Table 3: 31.14% have a Romance provenance, as in examples (7) to (9), 22.9% have a mixed etymology, as in (10), and 45.9% have been obtained from native forms, as in (11).
The formal characteristics of the Germanic lexicon used in this book may have played a part in the process of compounding. Monosyllabic items of native provenance which are transparent can be juxtaposed to others to obtain new compounds (Gotti, 1996: 22). These formations could equally meet the pragmatic principle of maximum transparency,13 as in example (12):
(12) pennyworth14
Once more, as in the Astronomy sample by Strangehopes, the text is practice-oriented rather than academic; that is, the intended audience is the average practitioner who needs a clear and simple lexicon to grasp the information contained in the written text.
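Since Tables 2 and 3 are not reproduced in this version of the text, the type counts behind the reported percentages can only be approximated. The Python sketch below back-calculates them from the totals of 60 (Astronomy) and 61 (Medicine) compound types given above. The resulting counts are my own reconstruction, not figures taken from the tables, and the function name `implied_counts` is purely illustrative.

```python
def implied_counts(total, percentages):
    """Convert reported percentages into the nearest whole-number type counts."""
    return {label: round(total * pct / 100) for label, pct in percentages.items()}

# Percentages as reported in the text; the totals are the compound types per discipline.
astronomy = implied_counts(60, {"Germanic": 51.6, "hybrid": 36.7, "Romance": 11.7})
medicine = implied_counts(61, {"Germanic": 45.9, "hybrid": 22.9, "Romance": 31.14})

# Reconstructed counts (an assumption): they should add up to the reported totals.
print(astronomy, sum(astronomy.values()))
print(medicine, sum(medicine.values()))
```

Under this assumption the rounded counts add up to 60 and 61 respectively, which suggests that the percentages quoted for both disciplines are internally consistent.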

Some unclear cases
The border between compounds and collocations is a fuzzy one. In the data under assessment here there are some instances which do not fit neatly into either of these categories, but which occupy different positions on a lexicalisation scale. Apart from the prototypical elements of the category "compounds" that have been seen before, I have come across certain types that could be viewed as peripheral to the class (Rosch, 1978).
Table 4 below lists these unclear types found in each text.
Lord of the Eclipse; The center of the planetary orbits; Circle of variation
The distribution of these types indicates that there are fewer unclear or peripheral cases in those samples where compound nouns of Germanic origin predominate (Shears's A Choice Manual and Strangehopes's A Book of Knowledge). Though these texts belong to different disciplines, they share a common target audience. They can be included within an informative kind of text type.
14 This term is not scientific as far as I can see. Perhaps we need the context cited to be sure. In any case, it also has many lexicalised senses, such as a bargain, something of little value, a small quantity of something etc., but it is not the item that is scientific but the text in which it is used.

From a structural point of view, these instances can be grouped as follows:
Type I. N+N: noun + noun
Type II. N+A: noun + adjective
Type III. A+N: adjective + noun
Type IV. NPs containing an N+PP: noun + prepositional phrase
Type I, N+N combinations (northeast wind, south angle, north angle, east angle), show the highest degree of lexicalisation. They might, therefore, be regarded as compounds rather than NPs.15 From a semantic point of view, they apparently convey a single meaning which is predictable from the meaning of the grammatical head, which in turn is modified by the left-hand element. As a result, the compound noun is a hyponym (northeast wind) of the nuclear lexeme (wind). It is a sort of endocentric compound. In this sense, these compounds are precise and specialised, though on some occasions the lexical unit that generates the compound is used in common, ordinary speech (wind).
Multi-lexeme constructions containing an N and an A, either pre- or post-posed (zodiacall circle, circle Equant), could also be analysed as endocentric compounds, hyponyms of the corresponding head rather than NPs. N+A compounds are illustrated by examples such as circle excentrique, circle equant. These are cases that correspond to the French type, in which the adjective occurs in post-position (Moskowich, 2002). Could these combinations be understood to be compounds?16 They should be, since the combination of the two simple lexical units generates a third one with a specific meaning within the discipline of astronomy.
Less lexicalised are those combinations formed by N+PP (line of the Auges, center of the Orbe, the tangent of the angle). Could they be interpreted, once again, as compounds? For an affirmative answer, the explanation might be that of-phrases as postmodifiers imitate French style, transforming English scientific discourse into a more analytic variety of the language.17 In addition, the semantic cohesion the elements of this structure exhibit might be symptomatic of a certain level of lexicalisation. Hence, it is possible to consider them not as mere collocations but as compounds. This is especially the case with those examples which admit the double format, N+N or N+PP, as in south angle or angle of the South. In this sense, I agree that "complex lexemes are nominalizations of the respective collocations" (Lipka, 2005: 40), and that it is only a question of time as to how long it takes for each of these structures to become a compound (or not).
In other examples, however, the sequence N+N is not possible. Spirit of wine only occurs as N+PP, but in the same text wine glasse is also attested. They are coexisting variants in which the order of the elements differs. Maybe it is only a question of time before one of the two variants disappears or they specialise their meanings and acquire different uses.
If criteria at different linguistic levels are to be applied as a means of showing whether these instances can be admitted as compound nouns or not, phonology must first be discounted. Since we are dealing with written material, stress cannot be used as an accurate indicator of compounding. Spelling is not a valid criterion either, since no process of standardisation had been completed at the time (see, for example, the alternative spellings earth quake/earth-quake/earthquake in the same texts here). Neither does the fact that two words appear without a hyphen necessarily indicate that they are independent members of the same NP, as was seen above.
From a syntactic point of view, there are three properties that could play a part in determining compounding, namely, the recursivity principle, the right-hand head rule, and premodification by the intensifier very. None of these can be applied to the above-mentioned instances.
Some morphological constraints also act as obstacles in the interpretation of these sequences as compounds. The right-hand element is not always marked for number. In my view, some semantic impediments are also inherent to the right-hand element (such as being either a proper or an abstract noun). […] (21), on the one hand, and (22)-(24) on the other, can be observed. More extreme approaches would consider prepositional phrases in the above examples as phrasal adjectives functioning as postmodifiers in a Noun Phrase (Gross and Miller, 1990) instead of as prepositions embedded in a compound, as in French (Di Sciullo, 2005).
18 My special thanks to the anonymous reviewer of this paper who pointed out that the OED attests leaf-fall as a poetic expression. Leaf-fall occurs under the entry leave, n 1 as a special combination used in poetry and in Botany. This seems to support the idea that when the referents of the lexical units in the NP (N+PP (P+NP)) form part of ordinary speech and, consequently, are frequently used, the structure is more prone to lexicalise and to have an N+N counterpart that is understood as a compound.
From a semantic point of view, by contrast, these unclear cases can be considered compounds, since they are perceived as a single unit and express a single content (Zanvoort, 1972). Semantic narrowing (or a restricted use, at least) seems to be playing a part in the way in which these structures are perceived. Moreover, there also seems to be some sort of etymological conditioning, in the sense that a more restricted context of use is often associated with non-Germanic origin in the data from my corpus, for example in Center of the moon vs center of the epicycle. Those terms that seem to be more specific are less frequently used and come from "Latinate" languages.

Conclusions
The findings in this paper show that compounds are mainly of Germanic origin, a result that may be conditioned here by the fact that two of the three text samples under survey are addressed to a less literate type of audience. The vocabulary had to be within the reach of the reading public interested in scientific matters, since scientific texts as a specific-purpose product were widely disseminated and were carefully attuned to the demands of the audience.
The combination of two potential stems/bases does not always immediately end in the formation of a prototypical compound.
Combinations of N+N, A+N or even N+A and N+PP can form examples of compounding, although this claim comes with some qualifications, especially relating to the provenance of vocabulary items and the influence of French syntactic structures. Is circle equant less a compound than attorney general? No doubt the fact that both items have a functional distribution, and are therefore to be found in particular text-types (Görlach, 2004), explains why both can be considered compounds.
The phenomenon of lexicalisation is measured in terms of semantic cohesion among the items of a collocational/compound structure or, what amounts to the same thing, through the systematic "association of lexical items that regularly co-occur" (Halliday and Hasan, 1976: 284).
This cohesion appears to be weaker when there are some intervening grammatical words, as is the case with the preposition of or any article (either definite or zero). But the weight of the lexical items in the construction contributes to giving it a greater semantic cohesion, so that its structure can be considered compound-like. Although formally speaking they are closer to NPs, I argue that they have undergone a lexicalisation process and a parallel technicalisation phenomenon. As far as the social use of these expressions is concerned, they seem to be discipline-specific (the sine of the angle; the circumference of the pricked). With other PPs in which the head of the NP is of Germanic provenance, technicalisation seems to lose strength and, simultaneously, cohesion seems to be of a lesser degree. We can speak, then, of NPs (centre of the Earth, centre of the world).

fixed stars (Strangehopes, 1663: 25)
(4b) gumdragon (Shears, 1653: 985/6)

Table 2. Compounding and etymology in Astronomy texts

Table 3. Word formation and etymology in the Medicine text
Of all cases, 31.14% have a Romance provenance, as in examples (7) to (9).

Table 4. Unclear types
Orbite Elliptique circle; Circumference of the pricked; Center of the pricked; Centre of the Equant; Sine of he Hypothenusall; Centre of the world; Circle of Altitude; The center of the orbite; The sine of the angle; The tangent of the angle; Zodiacall circle
A total of 44 unclear types (4.4%) are distributed as follows: only one type in the Medicine text, but 43 in the Astronomy samples (32 in Wing's Harmonicon and 11 in Strangehopes's A Book of Knowledge).

References
Primary Sources
Strangehopes, Samuel. 1663. A Book of Knowledge in three Parts.
Shears, William. 1653. A Choice Manual of Rare and Select SECRETS IN PHYSYICK AND CHYRURGERY; Collected, and Practised by the Right Honorable, the Countesse of KENT, late deceased.
Wing, Vincent. 1651. Armonicum Coeleste: or, the Coelestial Harmonie of the Visible World.