On The Border Between Language and Dialect

There is a wealth of popular linguistic articles in print and online that claim to systematically explain what is meant by the terms “language” and “dialect”. Many opt for the neat conclusion that a language is written and codified as well as spoken (standard), whereas a dialect is merely spoken (vernacular). But such dictionary-clinging definitions fail to reflect the lived experience of languages in all their variety and nuance.

“A language is a dialect with an army and a navy”

Perhaps the clearest concession ever made on the matter came from the distinguished linguist Max Weinreich (1894-1969), who has been immortalised through the phrase usually attributed to him that “a language is a dialect with an army and a navy”. Of course, this cannot be taken literally, but should rather be seen as a comment on the great political divisions between languages and dialects. Such divisions are often created in a way that is unconcerned with linguistic similarities or differences.

Such a politicised argument is well supported by the example of certain languages of the Balkans, namely Bosnian, Croatian, Montenegrin and Serbian. These four languages are in fact mutually intelligible varieties of the same language, which some linguists like to call Bosnian-Croatian-Montenegrin-Serbian (or BCMS, for short). Despite the undeniably strong similarities between all four varieties, they are accepted as independent languages for political reasons. Most pervasively, this political rationale is based on national identity and regional conflict surrounding the delicate borders of the Balkans. It also explains why Serbo-Croatian has often been seen historically as an entity separate from Bosnian, which points to the sensitive power dynamics and national associations in this region.

BCMS is therefore, like Bahasa or Hindustani, truly a single language with multiple codified varieties. Such languages are known as pluricentric languages.

But the argument that pluricentric languages like BCMS remain entirely the “same language” is a very weak one. It is no surprise that, owing to starkly different identities and a lack of cultural unity post-partition, varieties such as Serbian and Bosnian or Hindi and Urdu begin to take on noticeably different personalities. Ultimately, this results in the varieties growing further apart over time.

One way in which this occurs is through borrowing words from different languages in order to further the creation of an independent identity. For example, more Sanskrit loanwords have been deliberately added to the Modern Hindi of India, to make the connection with the language’s Hindu roots more apparent, while further Arabic and Persian loanwords have been imposed on the Modern Urdu of Pakistan, to further Islamicise the language. These new words are often only used in more formal or intellectual settings. Notably, Hindi has the formal dhanyavād to mean ‘thank you’, instead of the more colloquial, and more Urdu, shukria. In the colloquial language of the majority, however, most of the basic vocabulary, as well as the grammar, has remained the same.

The Dialect Continuum

The theory of the so-called “dialect continuum” is relevant here. This highly logical theory, widely supported by dialectologists, asserts that although neighbouring varieties of a language might be mutually intelligible, minor differences become more pronounced over distance, so that speakers from regions far apart can scarcely understand one another. In short, the argument goes that as physical distance increases, so do linguistic differences. Arabic is one of the strongest examples to illustrate this theory. A speaker of Dārija Arabic from Morocco would barely be understood if they were to engage in a conversation with someone who speaks Baghdadi Arabic, for example. Here, the “language” they would typically use would be Modern Standard Arabic (MSA, usually referred to as al-Fuṣḥā), or perhaps an educated dialect aimed at mitigating the differences between dialects. Such a dialect is still heavily influenced by MSA, with only simplified features of regional dialects, though it is considered more natural in speech than MSA. The linguist El-Said Badawi (1929-2014) used the term ‘the colloquial of the cultured’ (ʿāmmiyyat al-muthaqqafīn) to describe this dialect.


Diagram by Badawi of his thesis on how features of Standard Arabic decline as the dialect in Egypt becomes more colloquial. The top of the triangle is Classical Arabic (fuṣḥā at-turāth), followed by Modern Standard Arabic (fuṣḥā al-ʿaṣr), then ‘the colloquial of the cultured’ (ʿāmmiyyat al-muthaqqafīn), followed by ‘the colloquial of the basically educated’ (ʿāmmiyyat al-mutanawwarīn), and at the bottom is the so-called ‘colloquial of the illiterates’ (ʿāmmiyyat al-ʾummiyyīn).

Source: Badawi, Mustawayāt al-ʿarabiyya al-muʿāṣira fī miṣr [The levels of contemporary Arabic in Egypt], 1973.

The Arabic Question

The case of Arabic poses a particularly interesting challenge to Weinreich’s quip about a language being a dialect with an army and a navy. If we accept the argument that political autonomy or a national identity is what allows dialects to become languages, why are varieties of Arabic so radically different that they are mutually unintelligible not classified as different languages? Indeed, Morocco has its own army and navy, does it not?

Again, we cannot afford to read such assertions too literally, as the cultural, religious, and social bonds that tie Arabic-speaking countries together are not easily undone by modern ideas of nationhood and nationalism in North Africa and the Middle East. Equally, we must be careful not to ignore movements that are in fact calling for the recognition of such dialects as languages separate from Arabic. This sentiment has been found especially amongst some predominantly Christian communities in Lebanon, with distinguished figures such as the writer Said Aql (1911-2014) calling for the recognition of a Lebanese language.

Of course, such movements have thus far failed to result in the recognition of new languages, and Maltese remains the only Arabic dialect recognised as its own language. Nonetheless, such cases do highlight the intensely varied and contradictory perceptions among native speakers of which varieties constitute dialects and which are languages in their own right.

It is therefore clear that the distinction between languages and dialects is not linguistic. Rather, political factors are a key influence over whether a variety of language receives the designation of “dialect” or “language”. However, social and cultural identities also play a significant role in determining which terminology is used, especially in relation to the geographic and historical contexts of the communities of speakers.

The distinction between language and dialect lies beyond the jurisdiction of modern linguistics. Politicians and their societies decide how language varieties ought to be categorised. It is for this reason that the borders between languages and dialects are not only very murky, but are also incredibly porous, and are often even more changeable and arbitrary than the lines on a paper map.

