Citation Forms in Scientific Texts: Similarities and Differences in L1 and L2 Professional Writing

This study investigates the use of citation forms in 30 scientific research articles in biology, chemistry and physics written by writers in L1 and L2 contexts. Citation forms were divided into integral (syntactically integrated citation) and non-integral (syntactically non-integrated). Integral citation was further categorized into subject position, non-subject position (passive; clause constituent) and noun phrase (adjunct agent structure; phrase constituent), such as “according to.” Findings show that although few papers were cited in integral citation across the disciplines, writers in the L2 context mainly employed them in a subject position while writers in the L1 context spread them over three positions, creating stylistic variation.


Introduction
To be able to produce academic texts in English, non-English speaking novice writers need to master various means to strengthen their argument, one of which is thought to be citation (Block & Chi 1995; Charles 2006; Dong 1996; Dubois 1988; Harwood 2004; Hyland 1999, 2001; Salager-Meyer 1999). Non-English speaking novice writers need to learn not only what to cite but also how to cite previous studies (Swales 1986, 1990, 2004). Although disciplinary variation in the use of citation and citation forms has been analyzed previously (Hyland 1999, 2000), relatively little attention has been paid to variation due to the writers' mother tongue. It may be the case that L2 writers have more difficulty in the use of citation forms to construct a persuasive argument than L1 writers. It would be useful to investigate differences in use of citation forms between L1 and L2 professional writers; however, the names and affiliation of writers do not always indicate the writers' mother tongues. Thus I shall refer only to the linguistic environment of the writers, i.e. the English speaking environment (L1 context) and the non-English speaking environment (L2 context). The purpose of this study is to compare use of citation forms in 30 scientific texts in biology, chemistry and physics written by writers in the L1 and L2 contexts.

Previous studies
In discourse analysis, citations have often been examined with reference to reporting verbs (Charles 2006; Hunston & Thompson 2003; Hyland 1999, 2001; Shaw 1992; Thomas & Hawes 1994; Thompson & Ye 1991). For example, Thompson & Ye (1991) studied the introduction sections of more than 100 papers to examine how writers show their evaluation of previous work, and interact with their discourse community, through the reporting verb. They also showed that writers reveal their positive and negative evaluation of previous studies by their choice of reporting verbs. It appears that negative opinion is often presented in a more subtle manner in context than positive evaluation (Thompson & Ye 1991: 374), and might therefore only be evident to insiders of the discipline.
This insiders' perspective in citation has been investigated in studies on citation analysis, which are closely associated with the disciplines of information science and sociology of science (Cozzens 1985; Small 1982; Shadish et al. 1995; White & Wang 1997). However, although integration of the findings of citation analysis with those of discourse analysis has long been proposed (early on by Swales (1986), and more recently by Harwood (2004) and White (2004)), researchers in these disciplines still seem unfamiliar with one another's achievements (White 2004). Because citation is crucial in the analysis of academic texts (Hyland 1999, 2000), researchers in discourse analysis can benefit from the findings of information science and sociology of science.
Citation analysis is a relatively new area of study, originating from an initiative to launch citation indexing by the pioneering information scientist Garfield (1955), and it has used three approaches (Liu 1993; White 2004), which show some resemblance to those of discourse analysis. The first and most dominant of these concerned the retrieval of cited work in the discourse community (Cole 2000; Garfield 1955). The number of citations was used as a criterion to judge the importance of the work within the discipline, based on the assumption that the more citations a paper obtains, the greater impact it has on the academic community (Cole & Cole 1971; Merton 1973). In discourse analysis this can be compared to the quantitative analysis of linguistic forms, such as investigating the frequency of passive voice compared with active voice (Salager-Meyer 1992).
However, citation counting was often criticized because papers are not all cited for the same reason. It was argued that the analysis needs to examine the function of the citations within a text, as some are employed to help to establish a theoretical framework while others are cited negatively (Moravcsik & Murugesan 1975). Thus categories such as negative citation and developmental citation were created to classify the roles of cited work in a paper, initiating another approach to citation analysis: content analysis (Moravcsik & Murugesan 1975; Moravcsik 1985). Researchers in content analysis analyzed the surrounding context of cited papers, and tried to evaluate the role of cited work in context. Interestingly, the role of context also drew attention in the analysis of academic texts. For example, Shaw (1992) pointed out that the choice of passive vs. active was influenced by contextual information, i.e. the organization of information in a text, rather than by any decision at the sentence level.
Despite some genuine efforts to classify the content of citations, the limitations of this approach became apparent. First, it was found that one citation may belong to more than one category (Cano 1989; Chubin & Moitra 1975). Second, Moravcsik & Murugesan (1975) focused on 30 theoretical high energy physics papers published in Physical Review. However, it was found to be impossible to apply the same categories across the disciplines (Chubin & Moitra 1975). Last but importantly, MacRoberts & MacRoberts (1984) claimed that linguistic analysis of citation does not always reveal the real intention of the writer because writers may mitigate their critical comments, which led to demands for understanding of the writers' intentions behind the citation. Thus the limitations of content analysis directed researchers to investigate the actual reasons for citation, i.e. the citers' motives (Brooks 1985, 1986; Budd 1999; Cronin 1998; Shadish et al. 1995; Wang & White 1999; White 2004).
Two types of motivation were put forward, based on either normative theory or on a micro-sociological perspective (Liu 1997). The former assumes that citation is for merit-granting, which was originally considered to be the main reason for citation (Cole & Cole 1967, 1976; Davenport & Cronin 2000; Merton 1973), as citation is part of the collective activity of knowledge construction in the discourse community. In contrast, the latter argues for persuasion: the citer's knowledge claim being seen as the major motivating factor in citation (Brooks 1984; Case & Higgins 2000; Latour 1987). In an influential paper, Gilbert (1977) argued strongly that writers cite in order to persuade their readers. His argument was so influential that it shifted attention from citation itself to the role of citation in a text, examining the individual writers' viewpoint rather than that of the discourse community. Gilbert (1977) even argued that works by authoritative figures in the discipline were likely to be cited because readers would be persuaded by the names. However, others argued that the choice of citation would not be based on the names of the researchers but the content of the work (Cozzens 1989; Zuckerman 1987). Subsequent studies have tried to balance the argument by presenting the idea of "rhetoric first, reward second" (Cozzens 1989), and this has since been confirmed by interviews with writers of academic texts concerning the motivation for citation (Case & Higgins 2000; Shadish et al. 1995; Vinkler 1998; Wang & White 1999).
Some discourse analysts also took a similar approach to understanding the writers' motivation behind the impersonal linguistic forms. Myers (1989, 1990) suggests the analysis of a social dimension in scientific discourse, showing how scientific research articles employ politeness strategies: positive politeness for solidarity, and negative politeness for deference to the discourse community (1989, 1992). Hyland's studies (1999, 2000) combined interviews with academics and analysis of a large corpus of academic texts, presenting similar views of the citers' motivation to those found in citation analysis (Brooks 1986; Cozzens 1989). He states that: "Reference to previous work is virtually mandatory in academic articles as a means of meeting priority obligations and as a strategy for supporting current claims" (Hyland 1999:362).
While citation analysis focuses on the use of citation itself, discourse analysis could further enquire into the purposes of citation forms. Citation forms may have the same purposes as those of citation found in citation studies. Swales originally categorized citation forms into two types (1990): integral for syntactically integrated citation, and non-integral for syntactically non-integrated citation. Findings of the studies on citation forms show that social science disciplines such as politics use more integral citation forms than do natural sciences (Charles 2006; Hyland 1999).
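The integral versus non-integral distinction can be illustrated with a rough sketch for author-date referencing only. The study does not describe any automated procedure, so the patterns and the names INTEGRAL, NON_INTEGRAL and count_forms below are hypothetical illustrations. The sketch treats a citation as integral when the author's name stands outside the brackets, as in Smith (2008), and as non-integral when name and year are both bracketed, as in (Smith 2008). It does not handle numbered referencing, and it cannot tell subject from non-subject position.

```python
import re

# Assumption: author-date style only. Name outside the brackets, year inside,
# e.g. "Smith (2008)", "Smith and Jones (2008)", "Smith et al. (2008)".
INTEGRAL = re.compile(
    r"[A-Z][A-Za-z'-]+(?: (?:and|&) [A-Z][A-Za-z'-]+| et al\.)? "
    r"\((?:19|20)\d{2}[a-z]?\)"
)

# Assumption: a bracketed group that starts with a capitalized name and
# contains a year, e.g. "(Smith 2008)" or "(Jones 2010; Lee et al. 2012)".
NON_INTEGRAL = re.compile(r"\([A-Z][^()]*?(?:19|20)\d{2}[a-z]?[^()]*\)")


def count_forms(text):
    """Count integral citations and non-integral citation groups in text.

    A bracketed group citing several papers is counted once, so this is a
    count of citation sites, not of cited papers.
    """
    return {
        "integral": len(INTEGRAL.findall(text)),
        "non_integral": len(NON_INTEGRAL.findall(text)),
    }


sample = (
    "Smith (2008) reported a rise, which was later confirmed "
    "(Jones 2010; Lee et al. 2012)."
)
print(count_forms(sample))
```

Pattern matching of this kind can only pre-sort candidates; deciding the syntactic position of each integral citation still requires reading the sentence.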
Hyland divided integral citation further into three categories: subject, non-subject (passive) and phrase-level constituent or adjunct agent structures, such as "according to …" (1999:347). In an analysis of his academic corpus Hyland found that physics, mechanical engineering and electronic engineering papers prefer non-subject position to subject position, showing the disciplines' preference for the impersonal structure of a sentence, with phrase-level constituent or adjunct agent structures being the least common choice (less than 20% of all the integral citation forms) in these disciplines. In contrast, biology was the only field that preferred subject position (46.7%) to non-subject position (43.3%) for integral citation.
Although the use of citation forms in academic texts has recently been examined across disciplines (Charles 2006; Hyland 1999), few studies have analyzed the differences in such use between L1 and L2 writers in published academic texts. L2 professional writers share similar knowledge about citation practices, but they may not always be as linguistically skillful as L1 professional writers in realizing these practices in their own texts, as was shown in an analysis of covering letters written by L1 and L2 professionals to accompany a manuscript for publication (Okamura & Shaw 2000). The present study intends to help clarify what L2 novice writers need to pay attention to when they use citation forms. However, because it is impossible to distinguish L1 writers from L2 writers from their names and affiliations, in this study the distinction is only made between those in L1 and L2 contexts based on the affiliation of the writers.

Research questions
This study examines scientific research articles written by writers in the L1 and L2 contexts in three scientific disciplines. The research questions are:
1) How do writers in the L1 and L2 contexts use integral and non-integral citation forms?
2) How do writers in the L1 and L2 contexts use the three locations of integral citation?
3) What are the possible purposes of citation forms?

Data collection
This study does not include academic papers in humanities and social sciences, because they use quotations as part of the citation and would thus require three categories for the analysis of citation. To compare the use of integral versus non-integral citation, the analysis of research articles was limited to three scientific disciplines. The analyzed papers were published only in American journals (see Appendix). Because scientific journals are often published by the national scientific community, such as the American Chemical Society, scientific research articles can be influenced by the policies of the individual national scientific community. The textual data is shown in Table 1. The journals were recommended by subject specialists as being prestigious in their discipline and the articles were chosen at random from issues published in 2001, including only full research articles and excluding review articles and short communications of one or two pages. One concern for the analysis of data is referencing type; because each journal follows the most common referencing type for non-integral citation in the discipline, chemistry and physics papers used a sequential numbering system such as (1, 2, and 3) while biology papers employed an author-date system such as (Smith 2008). As the sequential number type creates a more noticeable difference between integral and non-integral citation, the interpretation of the findings needs to be treated with some caution.
Integral citations were counted first in relation to the total number of citations. Then they were categorized according to their syntactic functions of subject position, non-subject position (passive; clause constituent) and phrase constituent (adjunct agent structure) such as "according to…" as suggested by Hyland (1999:347). Token and type numbers of integral citation in the main body of each paper were counted to identify whether the same paper was cited more than once by the writers. Type number refers to the cited papers listed in the references, while token number is the number of references in integral citation appearing in an individual paper. Expressed more simply, one cited paper was counted as one type, but two appearances of it in a paper became two tokens.
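The type and token distinction described above can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the function name type_token_counts and the citation keys (such as "Smith2008") are hypothetical and are not data from the corpus.

```python
from collections import Counter


def type_token_counts(integral_citations):
    """Return (type_count, token_count, repeated) for one citing paper.

    integral_citations: one key per integral citation found in the main
    body, in order of appearance (hypothetical input format).
    """
    counts = Counter(integral_citations)
    type_count = len(counts)            # distinct cited papers
    token_count = sum(counts.values())  # every appearance in the text
    repeated = sorted(key for key, n in counts.items() if n > 1)
    return type_count, token_count, repeated


# Hypothetical paper: one cited work appears twice in integral citation.
example = ["Smith2008", "Jones2001", "Smith2008"]
print(type_token_counts(example))
```

A paper whose token number exceeds its type number has cited at least one work more than once in integral citation, which is the pattern the analysis looks for.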
The syntactic locations of integral citations were grouped into three according to Hyland's distinction (1999). Here I shall adopt the terminology of Hyland (1999) and Thompson & Ye (1991): "writer" refers to the person citing, and "author" to the person cited.
Writers in the L1 context were affiliated to universities in English speaking countries, while writers in the L2 context were Japanese writers working for universities in Japan. However, to be able to associate the L1 context with L1 writers and the L2 context with L2 writers as far as possible, I avoided including Japanese names as authors among the papers written in the L1 context, and non-Japanese names as authors among the papers written in the Japanese L2 context. Japanese writers were chosen here because they are among the major L2 contributors of scientific research articles (Swales 2004).

Use of citation forms in scientific disciplines in the L1 and L2 language contexts
Table 2 shows the number of integral and non-integral citation forms, and the distribution of integral citation according to three positions in the 30 research articles. Of the total citations found in the 30 papers, integral citation accounted for only a small percentage in both L1 and L2 contexts (6.4% in L1 and 5.5% in L2). The small number of occurrences was also shown in the academic texts in hard-scientific fields in Hyland's corpus (1999). Table 2 also shows the total type number of integral citation, as opposed to the total token number, to show whether the same paper was cited more than once. Interestingly, writers in the L1 context (3.6%) tended to have fewer type numbers of integral citation than writers in the L2 context (6.6%). In other words, some writers in the L1 context tended to cite the same paper more than once in integral citation.
Another noticeable difference was the location in a sentence. Writers in the L1 context used integral citation mainly in a non-subject position (50%), with fairly limited use of a subject position (roughly 27%). These locations could perhaps have been chosen to support the positivist principles of science (Scollon & Scollon 2001), as both non-subject and noun phrase positions enable the subject of a sentence to be impersonal. In contrast, writers in the L2 context used almost 70% of all the integral citation forms in a subject position, with few instances in other positions.
Since the proportion of the subject position in scientific texts in Hyland's corpus (1999) roughly corresponds to the usage of writers in the L1 context, the dominant use of a subject position by writers in the L2 context in Table 2 appears to deviate from the norms and requires a more detailed analysis.

Use of integral citation in three locations across the disciplines
Table 3 shows the disciplinary variation of the use of integral citation in papers written by writers in the L1 and L2 contexts. Across the disciplines the subject position seems to be preferred among the writers in the L2 context, while the non-subject position is the most common among those in the L1 context. The dominant use of subject position among the writers in the L2 context was most evident in biology papers, as they used integral citation most compared to papers in other disciplines; this was also shown in Hyland's large corpus (1999). Because the numbers above may reflect individual writers' choice rather than disciplinary and linguistic contexts, Tables 4, 5 and 6 present individual papers' use of integral citation. Due to the small number of instances of integral citation among papers in chemistry and physics, Table 4 and Table 6 show that little variance was found in the token number of integral citation. In contrast, as biology papers employed many more instances of integral citations, Table 5 shows diversity in the token number among papers in this discipline. Three biology papers employed no integral citations, while one biology paper used six tokens of integral citation. One shared element of papers across the disciplines was the maximum token number of integral citation in a paper: it was limited to six in the three disciplines. Why do writers in the L1 context tend to use integral citation in a non-subject location? What might be the motives for the use of integral citation?

Purpose of the citation forms used
Studies in citation analysis have found that citation is used to persuade readers and acknowledge previous studies (Shadish et al. 1995; Vinkler 1998; Wang & White 1999; Case & Higgins 2000). The present study investigates whether this applies to the choice of citation forms in scientific research articles. To examine the purpose behind the use of citation forms, integral citation was chosen because it stands out in a text more than non-integral citation; writers need to be more selective about what to cite in integral citation. Among the 30 papers analyzed, no more than 13 papers in total employed more than one instance of integral citation in a paper. Of these 13 papers, only two were from physics; writers of physics papers seem keen to maintain an impersonal stance in their writing. Little difference was found to be due to the writers' language contexts in the number of instances of integral citation: of the 13 papers with more than one instance of integral citation, seven were by writers in the L1 context and six by those in the L2 context.
To examine the functional role of integral citation, token and type numbers of cited papers were counted. If one cited paper (one type number) appeared more than once (more than one token number), it can be hypothesized that integral citation of these particular works has a definite purpose, as they are being foregrounded quite sharply. The analysis shows that six papers cited the same paper more than once in integral citation. They were all from chemistry and biology; no papers in physics referred to a paper more than once. The difference between papers in the L1 and L2 contexts seems to be relatively minor. Among the six with repeated references to the same work, four occurred in the L1 context, while two papers in the L2 context cited the same paper more than once in integral citation.
Papers were also compared in relation to the location of integral citation. Four instances was the maximum token number per paper of integral citation in a subject position among all the papers analyzed. There was little difference in the maximum instances of token number in papers written by writers in the L1 and L2 contexts, at three and four instances respectively. However, a difference appeared in the most common location of instances of token numbers. Among the six papers which employed more than one integral citation in a subject position, only one was by a writer in the L1 context.
The following two extracts from biology papers exemplify a difference between the papers written in the L1 and L2 contexts. The first biology paper was written in the L2 context with four integral citations in a subject position.
Excerpt 1
Demura and Fukuda (1994) and Fukuda (1997) have presented a hypothesis that the process of differentiation of zinnia mesophyll cells into tracheary elements is divided into three stages; […]. Iwasaki and Shibaoka (1991) examined the time at which exogenous BL is required if zinnia cells are to differentiate into tracheary elements and indicated that the BL-requiring stage is late stage II. We have demonstrated that BL is a prerequisite for the expression of stage III-specific genes but not for that of stage I- or stage II-related genes (Yamamoto et al. 1997).
In excerpt 1, as the first three names, Demura and Fukuda (1994) and Fukuda (1997), were authors of the citing paper, they seem to have been used to emphasize the findings of the writers' own work; this appears to confirm the hypothesis that placing integral citation in a subject position serves to acknowledge some previous work. By contrast, the second integral citation, Iwasaki and Shibaoka (1991), seems to be employed to draw attention to a contrast between the cited paper and the writers' own paper. Do they need to have the cited authors in a subject position for the sake of persuading their readers? A contrasting pattern was found in a paper by writers in the L1 context.
Excerpt 2
Earlier studies with broccoli florets, stored at 5°C, showed that fatty acid levels decreased during postharvest senescence, and levels of peroxidation products increased at both 5°C and room temperature (Zhuang et al., 1995, 1997). These authors concluded that lipid deterioration of broccoli occurred in storage. In contrast to the observations made on the material stored at room temperature, we observed a significant increase in the TBARM content in tissues stored at 4°C similar to that reported by Zhuang et al. (1995).
The writers of excerpt 2 seem to have avoided repeating the same work, Zhuang et al. (1995, 1997), in integral citation by using "earlier studies" and "these authors", which results in less focus on these previous studies. The writers of excerpt 2 then placed emphasis on their own findings with the use of "we", and again gave less attention to one of the previous papers, Zhuang et al. (1995), by placing it in a non-subject position, as in "reported by Zhuang et al. (1995)". It is interesting to note that writers in both the L1 and L2 contexts used the same voice, but created different results in drawing attention to their own work and less attention to the cited work.

Discussion and conclusion
This study has investigated how writers in the L1 and L2 contexts use citation forms in research articles in chemistry, biology and physics to construct a persuasive argument. The results show that writers in both contexts used integral citation forms in only 5 or 6 percent of the total number of cited papers in the scientific research papers examined here. This confirms earlier findings of much less frequent occurrences of integral citation in scientific research papers than those in social science and humanities (Hyland 1999, 2001).
An appreciable disciplinary difference in instances of the use of integral citation was discerned between biology on one side and chemistry and physics on the other. Biology papers employed integral citation most, as was also shown in Hyland's study (1999). It can be said that they allow more personal involvement in scientific discourse, while chemistry and physics papers prefer to maintain an impersonal stance to the writers' argument. However, we may need to consider referencing type as part of the motivation for the fewer instances of integral citation. The chemistry and physics journals here adopted the sequential number system for citation, i.e. using names of the cited authors for integral citation and stating a number for non-integral citation. This referencing type may have discouraged the writers from using integral citation, as it creates a sharp contrast between integral and non-integral citation.
Given the few instances of integral citation per paper in chemistry and physics papers, its role might be rather limited.However, in biology papers some instances of selective use of integral citation seem to serve the writers' purposes: acknowledgement of previous studies and maintenance of attention to the writers' own findings.
The acknowledgement of previous studies was shown in the repetitive reference to the same work in integral citation: because it stands out among other cited papers, it will be seen as crucial to the writers' argument. By the same token, the subject position of integral citation may also be employed for the same reason. As the subject position gives more prominence to the authors than the non-subject position and the noun phrase construction, it can draw more attention than these can.
The emphasis on the writers' own findings was shown in the non-subject position and the use of noun phrases for integral citation. As these do not break the flow of scientific argument with the insertion of cited authors' names, as the subject position does, their use helps the writers to expand their own argument and to direct the readers towards their knowledge claim. In fact, when one biology paper written in the L1 context employed as many as six integral citation forms, none of the integral citations appeared in a subject position. Use of the three types of integral citation is also useful as it creates some variation in a text, and the use of three locations can help to produce a gradual shift of attention to and from cited authors. Non-integral citation also helps to maintain the readers' attention to the writers' work, as was shown in excerpt 2. It seems quite natural to have cited papers mostly in non-integral citation.
This study shows that citation forms are employed to acknowledge some of the previous work and to draw attention to the writers' own findings. The use of citation forms appears to be the linguistic realization of citer motivations identified in citation analysis studies (Budd 1999; Cronin 1998; Shadish et al. 1995; Wang & White 1999; White 2004). The possible gap in awareness of the use of integral citation seems to be that writers in the L2 context appreciate integral citation as a rewarding system, but may not exploit it as much as those in the L1 context to promote their own work.
Novice researchers often learn about the role of citation, but may not be aware of the use of citation forms to fulfill certain purposes. Thus, based on the findings of this study, it may be useful to discuss with novice L2 writers the fact that citation forms are purposeful; integral citation can be employed to highlight important previous studies, but attention also needs to be given to the writers' work in an academic text. Choice of integral citation can influence the writers' attempt to persuade readers in a text. However, it should be noted that it is possible to write academic texts without appealing to integral citation, as has been shown in the present study. Thus, it should be remembered that use of integral citation needs to be limited and, if employed, its use in a subject position needs to be reserved only for specific papers that the writers find necessary to acknowledge as an important contribution to their studies.
Finally, of course, more studies would be useful. For future studies, it would be interesting to analyze academic texts with quotations in the fields of social science and the humanities. Furthermore, to better understand academic discourse it may be necessary to integrate the results of this study with data on reporting verbs and the use of tense in academic texts.

Table 4
Use of integral citation forms in the individual papers in chemistry