Refugee or migrant? What corpora can tell

It has been suggested that there is a clear difference between the terms REFUGEE and MIGRANT (for example Edwards 2015). A migrant is someone who chooses to move in order to find work or a better life, while a refugee is forced to move because of threat to life or freedom. This study looks at how the two terms are used in British English today and explores what contemporary corpora can reveal about changes coinciding with the escalation of the European migrant crisis.


Introduction
In August 2015, UNHCR published an article on its website in which Adrian Edwards discusses the words refugee and migrant. He points to the difference between refugees, whose status is defined and protected in international law, and migrants, who are subject to the immigration laws of individual countries, and argues that "[t]he two terms have distinct and different meanings, and confusing them leads to problems for both populations" (Edwards 2015).
As a linguist, I find it difficult to accept casual claims about the meaning and use of words without wanting to examine them in more detail. As a corpus linguist, I find the natural approach is to ask 'what can I find in a corpus?' For this paper I want to see what can be found out about the current use of the terms REFUGEE and MIGRANT in contemporary British English, and explore whether the patterns of use can be seen to have changed with the recent developments in Europe sometimes referred to as 'the European Migrant Crisis' (Wikipedia 2016; UNHCR 2015). The aim is not to provide a complete picture but rather to see what obvious patterns emerge when looking at two corpora of Present-day British English, and to reflect briefly on the type of research that is made possible through the existence of such resources.
The question of differences and similarities between REFUGEE and MIGRANT is not new and has received some attention recently. The Migration Observatory, based at the University of Oxford, published a report on the use of the words immigrants, migrants, asylum seekers and refugees in British newspapers (The Migration Observatory 2013). They compared the terms, focussing on (L1) collocates, and found, amongst other things, that MIGRANT often appeared with words related to economics or work, while the collocates of REFUGEE related to conflict, fleeing and nationalities. The research project 'Discourses of refugees and asylum seekers in the UK press, 1996-2006', based at Lancaster University, produced a number of studies where the use of different terms was examined using a combination of discourse analysis and corpus linguistics (for example Khosravinik 2010; Gabrielatos and Baker 2008). These studies are all based on data from before 2015 and the start of what has been referred to as 'the European Migrant Crisis'. Camilla Ruz discusses the words used to describe migrants in her article from 2015, but she does not look at actual usage (Ruz 2015). Although these studies are all interesting as a reference point, they do not illustrate what is happening in the language today, as the crisis is developing and being discussed frequently in various contexts.
For the present study, I have been fortunate to be able to use a very recent version of the Oxford English Corpus (Oxford University Press n.d.), which allows me to examine data that stretches into March 2016. For comparison and reference, I have also used a well-known standard reference corpus, the British National Corpus (Burnard 2007).

The British National Corpus (BNC)
The material included in the British National Corpus (BNC) was selected to provide a representative snapshot of British English towards the end of the 20th century. It has been used extensively as a source of data upon which to base observations about contemporary language. The corpus contains samples of written and spoken language from a range of contexts. They date from 1960 to 1993, with the bulk of the material from the period 1985-93. As such, the corpus may today be considered not so much a mirror of the language as it is now but rather a historical reference to the language of the recent past, against which new features or trends may be compared. This is particularly useful when examining very recent developments, such as the potential change in patterns of use of REFUGEE and MIGRANT in the last year.
When comparing two similar words, differences in frequency can be revealing. Looking at MIGRANT and REFUGEE in the BNC, one immediate observation is that the frequencies differ: there are only 681 instances of MIGRANT compared to 2,723 occurrences of REFUGEE. The proportion of singular uses is somewhat higher for MIGRANT: 40% compared to 31% for REFUGEE (see Table 1).
Interestingly, no lexical verbs are used with migrants as subject or object more than ten times. The most frequent are came, with seven instances, and attracted, with five.
9. Migrants came in waves with their babble of tongues. (HH3 9352)
10. York had a stronger pull than smaller towns and attracted migrants over much longer distances than most places. (HWD 754)
Like migrant, the singular form refugee is mostly used as a pre-modifier to a noun. The most frequent nouns are camp/camps, which make up nearly one in five of the instances (19%, 160 instances), and child/children (65 instances).
11. There were similar jubilant scenes in refugee camps in Lebanon. (AJM 617)
12. The first thing a refugee child learns is the name of his or her home village in Palestine. (APD 885)
There are no instances of refugee labour and only five of refugee worker.
The phrase refugee problem is found 32 times, but there are no instances of migrant problem.
14. But it will not be contained effectively if we do not deal with the refugee problem immediately.' (AJ6 775)
To sum up, this brief look at the instances of MIGRANT and REFUGEE in the BNC shows that REFUGEE is about four times as frequent as MIGRANT. The most common use of MIGRANT is to refer to working people who move: migrant labour/labourers or migrant worker/s. Some instances also refer to migrating birds or animals.
Frequent adjectives preceding MIGRANT are economic and illegal.
There are no particularly noticeable patterns where MIGRANT is used with certain verbs. REFUGEE seems to be used more in relation to the situation of refugees, for example refugee camps and refugee children. There are also references to the origin of the refugees. The phrase refugee problem is also found, usually in relation to something happening abroad, not in the UK. Refugees are seen to flee and arrive. The word also co-occurs with return, where REFUGEE is either the subject or object, and help, where REFUGEE is the object.
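The pre-modifier patterns summarised here (refugee camps, refugee problem, migrant labour) come down to counting which word follows the node word. The following is a minimal Python sketch of that count, assuming a toy sample made up of sentences quoted as examples elsewhere in this paper. The function name right_collocates and the simple tokenisation are illustrative assumptions; this is not how the BNC interface or Sketch Engine work internally.

```python
import re
from collections import Counter

# Toy sample: sentences quoted as examples in this paper (not a corpus query).
SENTENCES = [
    "There were similar jubilant scenes in refugee camps in Lebanon.",
    "The first thing a refugee child learns is the name of his or her home village in Palestine.",
    "But it will not be contained effectively if we do not deal with the refugee problem immediately.",
    "Researchers found migrant workers living in squalid, overcrowded accommodation",
    "unscrupulous firms employing migrant labour at rock-bottom wages.",
]


def right_collocates(sentences, node):
    """Count the word immediately to the right of the word form `node`.

    Hypothetical helper: matches one exact, lower-cased word form only,
    so 'migrant' does not match 'migrants'.
    """
    counts = Counter()
    for sentence in sentences:
        # Very rough tokenisation: lower-case alphabetic words, hyphens kept.
        tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())
        for current, following in zip(tokens, tokens[1:]):
            if current == node:
                counts[following] += 1
    return counts


if __name__ == "__main__":
    for node in ("refugee", "migrant"):
        print(node, right_collocates(SENTENCES, node).most_common())
```

A real study would lemmatise the collocates (so that camp and camps are counted together) and use part-of-speech tags to restrict the count to nouns, which this sketch does not attempt.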

The Oxford English Corpus (OEC)
Although the BNC contains material that is now at least 20 years old, it can still be used as a reference source to get an understanding of general language use. To be able to examine recent linguistic development and current use of vocabulary, however, another corpus is needed. The Oxford English Corpus is a very large collection of language data collated by Oxford University Press, mainly to be used as a resource for their dictionary makers and writers of language textbooks (Oxford University Press n.d.).
While the BNC, a reference corpus, was created to be a sample of language from a point in time, with carefully documented and published sampling criteria and composition, the OEC is a monitor corpus which grows with time as additional material is added. It contains a large proportion of material that can be found and used relatively easily, for example by harvesting material from online newsfeeds and similar sources. That means that the proportion of news material in the corpus is considerable, while types of text that are not generally found online, such as spoken language and private correspondence, are less prominent. Little information is available about the sampling strategies used for the OEC, and no detailed breakdown of the composition of the material is published.
Users of the OEC have access both to a stable 'frozen' version of the corpus and to more recent updates. The current study uses the New Monitor Corpus, March 2016 (n6), which according to the corpus site contains over 6.6 billion words (7.7 billion tokens) from over 23 million documents. Searches have been restricted to material classified as 'British English' with a creation date between March 2012 [5] and March 2016. This material comes to just over 740 million tokens. Unless otherwise specified, the term OEC will henceforth be used to refer to this material, excluding non-British material and material from other or unknown periods. Searches are made using the Sketch Engine tool ('Sketch Engine' n.d.).
Searching this version of the OEC retrieves 23,737 instances of MIGRANT and 24,306 of REFUGEE (using the 'simple query' [6]). That means that there is hardly any difference in the overall frequency of the two words. Although the BNC and OEC are not necessarily suitable for comparison in every respect, it may be interesting to note that in the BNC, REFUGEE is approximately four times more frequent than MIGRANT. Comparing relative frequencies is difficult as corpus size is calculated differently, but a rough estimate suggests that the terms are more frequent in the OEC overall. [7]
As the OEC is a monitor corpus, with new material added for each month, it is feasible to look at the frequency of the two words across time. Figure 1 shows the relative frequency of REFUGEE and MIGRANT for each month from May 2012 to March 2016. [8] What is immediately obvious from the diagram is that at the beginning of this period, frequencies for both words are low with little difference between them, REFUGEE being slightly more frequent. From April 2015, the relative frequency of MIGRANT increases, to peak in September 2015 when the relative frequency is nearly eight times higher than it was six months earlier. The use of REFUGEE also increases, but not until August 2015, when there is a substantial change. Use of both words seems to decrease slightly after September 2015, but it is still much higher than it was a year earlier or at any time before that.
So, what is the explanation for this increase? There are two likely reasons why such a significant change can be seen. What may seem most interesting is the idea that something has affected language use so that these words are now used more frequently. That would not necessarily be because users suddenly develop a particular fondness for these words; the same effect would be seen if topics referring to migrants and refugees were discussed more. If such topics feature more in newspapers and general discourse, the frequency of these terms would also increase. It has been suggested that the 'European Migrant Crisis' started in April 2015 (Wikipedia 2016). At that time, many news outlets started reporting more on people trying to cross the sea to reach Europe. This correlates with the pattern in the graph, which shows that MIGRANT is used more frequently from April 2015.
Another possible explanation for the increase relates to the composition of the corpus. If the corpus contains roughly the same kinds of texts in equal proportions for each month, we would not expect the composition as such to affect the frequency of various words. However, if different kinds of material are included in different periods, this could affect what is found when the distribution of an item is examined across time. There is little information about how material for the OEC is selected, and how similar or different sections may be. It is, thus, not possible to say to what extent differences in the composition of the corpus parts affect the distribution of instances of MIGRANT and REFUGEE. It is an interesting question, and cannot be disregarded completely, but for the current study looking at how the words are actually used in the corpus may be more rewarding.
[5] Searches in the corpus do not retrieve any results from material classified as 'British English' for March and April 2012, which means that in practice the corpus covers the period from May 2012.
[6] The Sketch Engine manual specifies that "Simple query tries to find all word forms including inflected variants" (https://www.sketchengine.co.uk/simple-query/).
[7] The calculation is based on the relation between OEC tokens and words (7.7 billion tokens and 6.6 billion words according to the corpus site), suggesting the component examined here is roughly equivalent to 630 million words. That means the frequency of either of the two terms in the OEC is around 38 per million words (pmw). That is higher than in the BNC, but comparisons may be skewed by different definitions of the concept 'word'.
[8] As the corpus searches do not retrieve any material from March or April 2012 with the classification 'British English', these months have been excluded from the graph.
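The rough estimate of frequency per million words reported for the OEC can be retraced step by step. The Python sketch below uses only the figures given in this paper, with one exception: the BNC size of roughly 100 million words is an assumption based on the commonly cited approximate size of that corpus and is not a figure stated here. The helper names are illustrative.

```python
# Figures reported in this paper for the OEC.
OEC_TOTAL_TOKENS = 7.7e9   # whole corpus, according to the corpus site
OEC_TOTAL_WORDS = 6.6e9    # whole corpus, according to the corpus site
SUBSET_TOKENS = 740e6      # British English subset ("just over 740 million")
OEC_HITS = {"MIGRANT": 23_737, "REFUGEE": 24_306}

# BNC counts from this paper; the corpus size is an ASSUMPTION
# (commonly cited approximate size, not stated in the paper).
BNC_WORDS_ASSUMED = 100e6
BNC_HITS = {"MIGRANT": 681, "REFUGEE": 2_723}


def estimated_words(tokens):
    """Convert a token count to an estimated word count, assuming the
    subset has the same word-to-token ratio as the corpus as a whole."""
    return tokens * OEC_TOTAL_WORDS / OEC_TOTAL_TOKENS


def per_million(hits, size):
    """Normalise a raw count to a frequency per million units of `size`."""
    return hits / size * 1_000_000


subset_words = estimated_words(SUBSET_TOKENS)
oec_pmw = {term: per_million(n, subset_words) for term, n in OEC_HITS.items()}
bnc_pmw = {term: per_million(n, BNC_WORDS_ASSUMED) for term, n in BNC_HITS.items()}

if __name__ == "__main__":
    print(f"Estimated size of OEC subset: {subset_words / 1e6:.0f} million words")
    for term in OEC_HITS:
        print(f"{term}: OEC {oec_pmw[term]:.1f} pmw, BNC {bnc_pmw[term]:.1f} pmw")
```

Under these assumptions both terms come out at between 37 and 39 per million words in the OEC subset, and both are lower in the BNC. As noted in the text, the two corpora count words differently, so the comparison can only be indicative.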
To see if the increased frequency of the two terms coincides with changes in the use of certain linguistic patterns, the OEC material has been divided into two sub-sets, earlier (2012-14) and later (2015-16). The sizes of the two sub-sets are approximately 580 million and 160 million tokens respectively. Reference will at times be made to patterns found in the BNC, recognising that the corpora are very different in composition and not necessarily suitable for detailed comparisons.

Comparing use in the BNC and OEC
In the BNC, a large proportion of MIGRANT, especially the singular form, was used as a pre-modifier with the nouns labour(ers) and worker(s). In the OEC, a similar pattern is found in the material from 2012-14. Over half the instances of MIGRANT used to modify a noun are found with worker (1,200 instances), which makes it ten times as common as any other noun in this context. Labour and labourer are also found with MIGRANT, as are boat, child, population and community. There are only 16 instances of migrant crisis.
21. Researchers found migrant workers living in squalid, overcrowded accommodation (doc#3519964)
22. Indeed, one likely consequence would be to divert yet more jobs to unscrupulous firms employing migrant labour at rock-bottom wages. (doc#1895565)
Looking at the data from 2015-16, migrant worker and migrant labour/er(s) are still found, but the noun most frequently modified by MIGRANT is crisis. Migrant crisis occurs 685 times in the material from January 2015 to March 2016, compared to only 16 in the earlier, larger subset. There are no instances in the BNC. Taking into account the difference in corpus size, migrant crisis appears to be about 150 times more frequent in the later OEC material. Another difference between the 2012-14 and 2015-16 data is that camp and benefit are more frequent in the later material.
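The figure of about 150 times follows from normalising the two raw counts of migrant crisis by the size of each sub-set before comparing them. A minimal Python sketch of that arithmetic, using the approximate sub-set sizes given earlier (580 million and 160 million tokens); the function names are illustrative:

```python
# Approximate sub-set sizes in tokens, as given in this paper.
EARLIER_TOKENS = 580e6   # 2012-14
LATER_TOKENS = 160e6     # 2015-16

# Raw counts of "migrant crisis" reported in this paper.
EARLIER_HITS = 16
LATER_HITS = 685


def per_million_tokens(hits, tokens):
    """Raw count normalised to occurrences per million tokens."""
    return hits / tokens * 1_000_000


def frequency_ratio(later_hits, earlier_hits):
    """How many times more frequent the item is in the later sub-set,
    after correcting for the different sub-set sizes."""
    later = per_million_tokens(later_hits, LATER_TOKENS)
    earlier = per_million_tokens(earlier_hits, EARLIER_TOKENS)
    return later / earlier


ratio = frequency_ratio(LATER_HITS, EARLIER_HITS)

if __name__ == "__main__":
    print(f"migrant crisis: about {ratio:.0f} times more frequent in 2015-16")
```

Because the sub-set sizes are themselves approximate, the result is best read as an order of magnitude rather than a precise ratio.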
23. The migrant crisis in Europe has dominated headlines for much of 2015. (doc#2503725)
24. Cameron has been advised that some limits on migrant benefits may require changes to the EU's treaties. (doc#3678241)
In the BNC, REFUGEE is found modifying the nouns camp/s and child/ren. These are also found in the OEC in both sub-sets. Camp is the most frequent noun in the earlier material, used more than four times as often as the next most frequent (agency). It is also frequent in the 2015-16 data, but not found as often as refugee crisis, which occurs 938 times, compared to 137 occurrences in the larger 2012-14 set. Other words used with REFUGEE, in both sub-sets of the OEC, relate to the reception of refugees, such as agency, status, centre.
The British Prime Minister David Cameron has been heavily criticized for his use of language when referring to groups of migrants, using the phrases "a swarm of people" and "bunch of migrants" (see, for example, Elgot and Taylor 2015; Freedland 2016). Looking at what words are used for groups of refugees and migrants, we find that there are considerable similarities between REFUGEE and MIGRANT. Among the most common patterns are uses with seemingly neutral words like number and group. Others include words relating to movement of water: flow, influx, wave, flood, and surge. There are few instances where bunch or swarm are used, other than those that refer to the Prime Minister's usage.
27. Greece struggles to cope with large flows of refugees driven to the European Union by war and (doc#3689596)
28. European countries have taken in waves of migrants fleeing violence. Germany allowed 20,000 (doc#3743191)
29. Asked if he considered the phrase "bunch of migrants" to be pejorative the spokesman said " (doc#7422058)
Where patterns including verbs are concerned, the BNC material suggests that refugees are found to flee and arrive, and they are helped or returned. There are no very frequent patterns including MIGRANT. In the OEC, frequencies are much higher overall, so more co-occurrence patterns are found. Where REFUGEE is used as the object, many of the verbs can be seen to suggest that there is a willingness to assist and support refugees, such as help, accept, welcome, and support. The need for a home can also be seen, and problems finding somewhere to stay are expressed with verbs such as resettle, house, relocate, accommodate, and host.
30. But more than this, people are taking control. One speaker asked the crowd gathered outside the historic Sheldonian Theatre: 'How many of you are ready to welcome and help refugees here?' A sea of hands went up. (doc#3028450)
31. Spain said it was ready to accept as many refugees as the Commission proposes, reversing course after saying it was being asked to take too many. (doc#3624184)
32. Thousands have offered to house refugees in their own homes, signing up to an online (doc#7408738)
MIGRANT is found with verbs that may illustrate a divided attitude to migrants and how to receive them or not, for example stop, allow, prevent, accept, and deter. Other frequent verbs indicate that migrants, just like refugees, need help and support, for example rescue and help.
There are only small differences between the earlier and later material where the use of the two words as objects is concerned.
Looking at REFUGEE and MIGRANT as subjects, some interesting patterns can be seen. Refugees flee, live, face, cross, leave, seek and escape: verbs that may be seen as illustrative of the plight of people forced to leave their homes and seek a new existence. Where MIGRANT is concerned, many instances are used with verbs that mainly show that they move, such as arrive, come, enter, and cross. There are also examples of verbs that can be used to illustrate the difficult journey, such as try, die, flee, and attempt. The range of verbs found with the two terms does not differ much between the sub-sets of the corpus; the main difference is in the frequency of occurrences, with a considerable increase in the frequency of constructions including MIGRANT and REFUGEE in the material from 2015-16.

Concluding remarks
As noted at the beginning of this paper, it has been suggested that there is a clear difference between the terms REFUGEE and MIGRANT (for example Edwards 2015). A migrant is someone who chooses to move in order to find work or a better life, while a refugee is forced to move because of threat to life or freedom. A more detailed study of more examples would be needed to determine exactly how the two terms are used today, but the current overview suggests that this distinction is largely maintained. Phrases such as migrant labour and economic migrant match the idea that MIGRANT refers to people moving for work or in search of a different existence, while patterns including REFUGEE more often refer to someone who is fleeing or needing help. What is interesting to note, however, is that this distinction may have become slightly less evident recently, as the European Migrant Crisis becomes more pronounced and the focus is drawn to the difficulty and suffering of people on the move and the effect of the mass movement on Europe. Further observation of developments is planned, and it is believed that future studies, made possible by the existence of the monitor corpus OEC, will cast light not only on new developments but also on the factors that affect the use of the two terms today.

Figure 1. Variation in relative frequency per month
33. More than half a million migrants and refugees fleeing war and poverty in the Middle East (doc#3603608)
34. More than half a million migrants have entered the European Union this year (doc#3688134)

Table 1. Frequencies and proportions of MIGRANT and REFUGEE in the BNC.
As the number of former South Vietnamese army majors and persecuted intellectuals drifted down to infinitesimal proportions and the number of economic migrants grew, the United States looked the other way. (AA1 28)
8. Illegal migrants who hide from authorities often became victims of crime or criminals, he said. (K5D 3067)
Among the plural instances of REFUGEE, a relatively large proportion is used in the name The United Nations High Commission for Refugees (126 instances). Although the phrase economic refugees is found, it is rare (8 instances, compared to 33 instances used with the less frequent migrants). A number of instances of REFUGEE are preceded by words denoting nationalities, including Kurdish, Somali, Vietnamese, Bosnian, and Iraqi. The number of genuine refugees is three times higher than the number of bogus refugees (34 compared to 11).
15. The United Nations High Commission for Refugees in Geneva called the British action 'premature' while other avenues were still open. (A9W 420)
16. Some are economic refugees who came to Britain seeking a better life; the rest are trying to escape oppression and even torture. (A59 711)
17. A group of Bosnian refugees has said a tearful farewell to charity workers after spending a week in their care. (K22 922)
Lexical verbs used with REFUGEE include forms of flee, help, arrive, seek and return, all natural to the situation in which refugees find themselves.
18. During the whole of the 1980s, refugees fled from the wars in Chad and
MIGRANT and REFUGEE are not only used as modifiers of nouns, but can also be modified themselves. Looking at the most frequent patterns, it seems obvious that the two words have different uses. REFUGEE is often found with words denoting someone's origin: Syrian, Palestinian, Jewish. Other modifying adjectives include political, vulnerable, and genuine. Although MIGRANT is also used with words referring to someone's origin, such as African, European, and Romanian, other frequent uses include terms relating to their status: illegal, economic, skilled, temporary. A difference between the earlier and later OEC material is that in the 2015-16 material, both REFUGEE and MIGRANT are used frequently with the adjective desperate. Many and more are also found among the ten most frequent modifiers for both words in the newer material.