How difficult is it to learn English? Linguistic distance in Australia

I became interested in how immigration trends affect how migrants settle after I started working at the Department of Immigration. One thing everyone has an opinion on, but for which you rarely see any empirical evidence, is how and why people struggle to learn English. We know English language proficiency has a substantial impact on settlement, mainly through the ease of social engagement and earning ability in the labour market. The better a migrant’s English, the more likely they are to settle successfully.

Recently, economic researchers have been on the case about native languages. Everyone agrees age plays a strong role: the younger the better when it comes to learning languages. An emerging candidate for another influential factor is “linguistic distance”. This is the gap between a migrant’s native language and the language of the country they have migrated to.

This paper – The Costs of Babylon: Linguistic Distance in Applied Economics – establishes a framework to empirically measure linguistic distance. Ingo Isphording and Sebastian Otten create bilateral relationships for languages by using encyclopaedias, linguistic publications and factbooks to generate distance scores. From their abstract:

[T]he effect of linguistic distance in the language acquisition of immigrants is analysed using data from the 2000 US Census, the German Socio-Economic Panel, and the National Immigrant Survey of Spain. Across countries, linguistic distance is negatively correlated with reported language skills of immigrants.

Their research suggests there is something to linguistic distance. However, let’s take note: this is one index, in a relatively new field, attempting to quantify something very complex. Sometimes empiricism goes off the deep end and you end up arguing over figures that bear very little relation to the real world. Personally, I don’t think this is the case here, and this seems like an area ripe for further research. With these caveats, I want to use the index to measure the average language distance of people migrating to Australia. As Australia’s immigration trends have shifted decisively away from Europe, whose languages are relatively closer to English than those of Asia, it is possible language distance has been increasing, making it harder for non-English speaking migrants to learn English.

From a public expenditure perspective, the government spends about $230m per year on English language provision under the Adult Migrant English Program (AMEP). If it is getting harder to learn English, there are implications for this program.

Some care is needed to answer this question. Skilled migrants, regardless of where they come from, typically have to meet English language proficiency requirements to enter Australia. Therefore I am excluding them from the analysis, as I assume they all speak English. This is not true in reality. About 15 per cent of the AMEP is made up of skilled migrants, typically spouses. I don’t have the unit data for AMEP enrolment, so for ease of analysis skilled migrants are excluded. I’m also going to assume humanitarian migrants have a much wider range of factors outside linguistic distance that affect their ability to learn English. For this quick and dirty examination, they are also excluded.

This leaves family migrants, who are the group best suited to this analysis. Family migrants make up over 50 per cent of all AMEP participants and can act as a good general indicator. The Department of Immigration keeps historical migration records here, which is where I found these family migrants and their language background. For this analysis, I used worksheet 3.3 – “Migration program outcome by stream and citizenship”. This allows me to exclude skilled and humanitarian migrants while matching family migrants to their countries of origin, which gives a good proxy for the language they speak.

A few caveats. Because I’m assessing language distance, I have excluded countries where English is the main language. The language distance index covers 178 countries, meaning a very small minority of migrants were excluded because their language distance score is unknown (2.2 per cent of migrants under the family stream from 1995-2012). Finally, of course, some people who already speak English will arrive from countries where English isn’t widely spoken. I’m only after a broad trend to assess my hunch, not a highly specific empirical data point, so I’m not too concerned about this. A more rigorous approach might include assumptions about how many people from each country speak English, or use AMEP participant information to weight certain countries given the impact they have on expenditure. If 50 per cent of people from a European background spoke English but only 25 per cent from an Asian background did, this would skew any results.

To begin with, I merged the migration data with the linguistic data from the study:

[Screenshot: the migration data merged with the linguistic distance scores]

(Thanks to the authors who kindly sent me their full dataset)

“ldnd” is the linguistic distance score and the years run to 2011-12. A total of 162 countries are included. You can see from this screenshot that Austria (German) has a lower score than Argentina (Spanish). The lower the score, the closer the language is to English. The closest languages to English are Dutch, Norwegian, Swedish, Danish and German. The most distant are Vietnamese, Somali, Finnish, Turkish and Tamil.
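For anyone wanting to reproduce this merge step in code, here is a minimal Python sketch. It is an illustration only and makes no claim about how the original workings were done. The function name, the country labels and every number are made up, and the choice to measure the excluded share against arrivals from non-English-speaking countries is an assumption.

```python
def merge_with_distance(arrivals, ldnd, english_speaking):
    """Join family-stream arrivals to linguistic distance scores by country.

    arrivals:         {country: {year: number of family-stream arrivals}}
    ldnd:             {country: linguistic distance score}
    english_speaking: set of countries dropped before the join

    Returns (merged, excluded_share). excluded_share is the share of
    arrivals from non-English-speaking countries that had no score
    (the denominator is an assumption of this sketch).
    """
    merged = {}
    kept = 0
    unmatched = 0
    for country, by_year in arrivals.items():
        if country in english_speaking:
            continue  # language distance to English is not meaningful here
        n = sum(by_year.values())
        if country not in ldnd:
            unmatched += n  # no score available, so the country is excluded
            continue
        kept += n
        merged[country] = {"ldnd": ldnd[country], "arrivals": dict(by_year)}
    considered = kept + unmatched
    excluded_share = unmatched / considered if considered else 0.0
    return merged, excluded_share


# Made-up inputs, purely for illustration (not real migration data).
arrivals = {
    "Country A": {"2010-11": 400, "2011-12": 500},
    "Country B": {"2010-11": 50, "2011-12": 50},
    "Country C": {"2010-11": 1000, "2011-12": 1000},
}
ldnd = {"Country A": 95.0}          # Country B has no score
english_speaking = {"Country C"}    # dropped before matching

merged, excluded_share = merge_with_distance(arrivals, ldnd, english_speaking)
print(sorted(merged), round(excluded_share, 3))
```

Tracking the excluded share alongside the merge makes it easy to check that only a small minority of migrants drops out for lack of a score.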

Using the linguistic score and the number of arrivals by country of origin, I created a proportional average of the linguistic score for each year of the family migration stream:


(note: I’m not 100 per cent sure I’ve done this right given my limited statistics training, but I’ve double checked my formulas and workings and am as confident as I’m going to get)
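One way to read “proportional average” is as an arrival-weighted mean: each country’s score counts in proportion to how many family migrants arrived from it that year. The sketch below shows that reading in Python. It is an assumption about the method, not a record of the actual formulas, and the function names, language labels and numbers are invented for illustration.

```python
def weighted_mean_distance(arrivals, ldnd):
    """Arrival-weighted mean linguistic distance for a single year.

    arrivals: {country: number of family-stream arrivals that year}
    ldnd:     {country: linguistic distance score}
    """
    total = sum(arrivals.values())
    if total == 0:
        raise ValueError("no arrivals to average over")
    # Each country's score is weighted by its share of the year's arrivals.
    return sum(n * ldnd[country] for country, n in arrivals.items()) / total


def yearly_averages(arrivals_by_year, ldnd):
    """Apply the weighted mean to every year in the series."""
    return {
        year: weighted_mean_distance(arrivals, ldnd)
        for year, arrivals in arrivals_by_year.items()
    }


# Made-up scores and arrivals (not the study's values).
ldnd = {"Near": 60.0, "Far": 100.0}
arrivals_by_year = {
    "year 1": {"Near": 300, "Far": 100},
    "year 2": {"Near": 100, "Far": 300},
}

averages = yearly_averages(arrivals_by_year, ldnd)
print(averages)
```

In this toy example the average rises from year 1 to year 2 purely because the mix of arrivals shifts towards the more distant language, which is the mechanism being tested here.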

This shows a steady increase in linguistic difficulty for family migrants since the start of the 21st century. With a standard deviation of 0.37, the 2012 figure is about 1.5 standard deviations above the mean for the 1996-2012 period. The main factor is the increase in Chinese and Vietnamese family migrants, who have high linguistic distance scores. In 2009-10, there were more than 10,000 Chinese family migrants, a threefold increase from 1997-98.
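The “standard deviations above the mean” comparison is a plain z-score: subtract the period mean from the year’s figure and divide by the standard deviation. Here is a minimal sketch in Python. The series is made up, the function name is hypothetical, and using the population standard deviation (`pstdev`) rather than the sample version is an assumption, since the post does not say which was used.

```python
from statistics import mean, pstdev


def deviations_above_mean(series, value):
    """How many standard deviations `value` sits above the mean of `series`.

    Uses the population standard deviation, treating the series as the
    whole period of interest (an assumption of this sketch).
    """
    return (value - mean(series)) / pstdev(series)


# Made-up yearly averages, not the real series.
series = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = deviations_above_mean(series, series[-1])
print(round(z, 2))
```

Swapping in `statistics.stdev` would give a slightly smaller figure for a short series like this one.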

What we don’t know is what proportion of the difficulty of learning English can be attributed to linguistic distance. A previous working paper from Isphording shows a significant effect on literacy when age and linguistic distance are combined:

[Screenshot from Isphording’s working paper]

Instead of putting a number on this factor, I’m just going to say it appears to be something policy-makers should be aware of.

What does this mean for the AMEP? Well, thanks to increases in funding under both the Howard and ALP governments, total spending has increased over the last decade. In 2000, about $93m was spent on the AMEP, or $2,600 per migrant in the program. In 2012-13, $238m was spent, lifting per capita spending to a tick under $4,000. After accounting for inflation, that’s an increase of over $400 per migrant in real terms.
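The real-terms comparison works by restating the earlier per capita figure in later dollars and then taking the difference. A rough Python sketch of that arithmetic is below. The function name is hypothetical and the cumulative inflation rate is a placeholder, not a sourced CPI figure, so the printed result illustrates the method only.

```python
def real_increase(per_capita_start, per_capita_end, cumulative_inflation):
    """Real change in per capita spending, measured in end-period dollars.

    cumulative_inflation: total price growth over the period as a
    fraction, e.g. 0.35 for 35 per cent.
    """
    start_in_end_dollars = per_capita_start * (1 + cumulative_inflation)
    return per_capita_end - start_in_end_dollars


# $2,600 and (roughly) $4,000 come from the text above.
# 0.35 is a placeholder inflation figure, NOT a sourced number.
change = real_increase(2600, 4000, 0.35)
print(round(change))
```

Plugging in the actual cumulative CPI change for the period would reproduce the real-terms figure properly.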

This is a good sign. Changing trends in family migration have changed the demographic make-up of the AMEP. The top three native languages in the program are Mandarin, Arabic and Vietnamese. These trends have increased the average distance between English and migrants’ native languages, making it harder, on average, to learn English. However, if we assume additional dollars improve learning outcomes (a *major* assumption), the additional funding should be helping to partially or fully offset this increase in difficulty.

More research, particularly into AMEP participant trends, would be required to better evaluate how linguistic distance affects English language literacy in Australia. Under the AMEP, the majority of the funding increase may have gone to helping humanitarian entrants rather than family migrants. I’m unsure.

English is getting harder to learn for at least some migrants to Australia. I firmly believe settlement services such as English language classes are an integral part of why migration in Australia has been so different to other western democracies. Multiculturalism has thrived because people are well equipped to take part in society, something underwritten by the English language ability of recent migrants. Hopefully when budgets are razored in the coming months, this is not forgotten by an Abbott government intent on saving dollars.


