Rethinking the linear genealogy of Bangla

Silver coin of Danujamarddana Deva, circa 1417. The obverse of the coin is inscribed as ‘Sri Sri Danuja Marddana Deva’. The reverse of the coin is inscribed as ‘Sri Chandi Charana Parayana’.

As Darwin said, the linguistic system of modern humans has the "power of associating together the most diversified sounds and ideas" to produce an infinite number of sentences. This system evolved in the ancestors of modern humans in Africa sometime between 100 and 70 thousand years ago. Although all humans are genetically endowed with the same internal linguistic system, their languages are diversified in their inventory of sounds, sound patterns, word formations, and sentence constructions. Various factors play roles in the diversification of a language. The primary factor is the divergence of a speech community. In that light, human migration from Africa to different parts of the world is the original reason for the diversity of languages in today's world.

Human migration can be traced back to the prehistoric period by analyzing fossils and other archeological findings. The linguistic history of our languages, however, can be traced back only about six to seven thousand years, since languages do not leave fossils. Currently, there are roughly 6,500 languages spoken around the world. They are grouped into language families based on their genealogical relationships. However, the family relationship represented by a family tree often poses problems for understanding the origin of a language. This essay will review the origin of Bangla, which is traditionally represented in the tree model.

Silver coin with proto-Bengali script, Harikela Kingdom, circa 9th–13th century. Photo courtesy: Biswarup Ganguly

Bangla is a set of closely related dialects natively spoken by the ethnic group who identify themselves as Bengalis. All the dialects share Bangla as their family name, or last name, and their first names differ based on historical, geographical, literary, sociological, and other factors. The first name of a contemporary regional dialect is made up of the name of the region it is spoken in plus either the suffix -i/-ia or the possessive case marker -er, both of which mean "belongs to." For instance, the dialect spoken in the Barisal district of Bangladesh is Barisailla (<Barisal+ia) Bangla, the dialect spoken in Medinipur Division, West Bengal, India is Medinipuri Bangla, and the dialect spoken in the Comilla district of Bangladesh is Comillar (<Comilla+er) Bangla. Apart from the regional dialects, there are a couple of standard dialects of Bangla, one used in Kolkata, India, and the other in Bangladesh. Following Suniti Kumar Chatterji, we can understand all these Bengali dialects, except the literary dialect "Sadhu Bhasha," as emanating independently from a common source. This goes against the widespread belief that the West Bengali dialects descend from the Sadhu Bhasha, which, according to Chatterji, is a "composite speech" based on the West Bengali dialects (Chatterji 1926).

Now the question is, if all the dialects of Bangla flow independently from a common language, what is the name of that language? Following the tree model of language genealogy, we would chart them as Proto-Bangla→ Old Bangla→ Middle Bangla→ Modern Bangla dialects. However, it is difficult to imagine that Old and Middle Bangla were homogeneous dialects. We can rewrite the labels as Old Bangla dialects→ Middle Bangla dialects→ Modern Bangla dialects. Then again, the history of Assamese can be represented in the same way as Old Assamese dialects→ Middle Assamese dialects→ Modern Assamese dialects. But Chatterji (1926) maintains that "the agreement between Assamese and Bengali (Bangla) is so close that the dialects of Bengali and Assamese may be described as belonging to the same group." So, what could be the name of that group of dialects? Is he referring to a Bengali-Assamese language? No. He opines that Assamese could be associated with the North Bengali dialects, since the earliest documented Assamese, from the mid-15th century, is "practically identical" with the contemporary North and Eastern literary Bengali. His categorization is based on the comparative method, the standard practice of language genealogy. However, it creates a problem for the tree model. His analysis suggests that the so-called Bengali-Assamese is a dialect continuum whose dialects often overlap in their close agreements.

Chatterji's "agreements" are understood as shared innovations among dialects in modern parlance. Shared innovation is the most widely accepted criterion for grouping dialects into subgroups. We place genealogically related dialects in a subgroup when they share traits in their phonology, morphology, and grammar that they did not inherit from their parent language. Usually, speakers of dialects in such a group have a higher degree of mutual intelligibility with one another than with speakers of dialects from a different group. However, in the context of a dialect continuum, it is often difficult to draw a tree diagram that represents language change and the birth of new dialects from a single dialect. For instance, it is challenging to categorize a dialect that lies between Bangla and Assamese in terms of shared innovation and mutual intelligibility. Likewise, drawing a tree diagram will be problematic if an idiom is born out of dialectal contact between Bangla and Assamese.

Bhashacharya Acharya Suniti Kumar Chatterjee (1890 – 1977)

From a socio-political view, grouping dialects under a common name is often triggered by ethnic, national, and other considerations. For example, to some British administrators of India, Assamese was a dialect of Bangla. The Assamese speakers had to mobilize against that categorization to reclaim the identity of their language. Chatterji (1926) notes that Assamese dissociates from Bengali dialects when the speakers of Bengali acknowledge the supremacy of a literary dialect of Bengali. Likewise, the popular perception in Bangladesh is that Sylheti, Chittagonian, and Rongpuri are dialects of Bangla. For many Sylhetis, Chittagonians, and Rongpurias, however, their native tongues are languages in their own right. Linguistically speaking, the degree of mutual intelligibility between Sylheti and Standard Colloquial Bangla, or between Chittagonian and Standard Colloquial Bangla, is remarkably low. Many linguists concur that Sylheti, Chittagonian, and Rongpuri are separate languages, each a distinct set of dialects. However, it isn't easy to point to a language that Bangla, Assamese, Chittagonian, Sylheti, and Rongpuri sprang from other than the Bengali-Assamese dialect continuum. In a tree diagram, the internal structure of the Bengali-Assamese language is a flat tree. The evidence for a Bengali-Assamese language is some place names whose phonology looks very similar. It is a stretch to construct or conceptualize a homogeneous language from some place names. There was a literary dialect or lingua franca in the Bengali-Assamese stock, but the regional dialects were in a continuum with various degrees of mutual intelligibility. In that case, the question remains whether that lingua franca gave birth to the dialects that came later or whether they were born from the continuum. A tree model of language genealogy may not represent such complexities.

Sometimes the tree may even contribute to the confusion about the genealogy of a language. Historical linguistics has traditionally divided the history of modern languages into three phases: the Old, the Middle, and the Modern. The oldest form of a modern language is usually a dialect of a proto-language. For example, Old Bangla is a dialect of the Bengali-Assamese language. Now that Sylheti, Chittagonian, and Rongpuri are regarded as languages separate from Bangla, tracing their ancestral history along that line becomes necessary. The problem is that the literary tradition of the courts of the Middle Ages favored the literary dialects, and those dialects were the lingua franca of greater Bengal.

A 17th-century birch bark manuscript of Panini’s grammar treatise ‘Ashtadhyayi’. The text was originally written between the 6th and 5th century BCE.

On the other hand, the diversity of oral dialects was not recorded throughout history. As a result, it is often challenging to amass evidence for Old and Middle Sylheti, Chittagonian, and Rongpuri. All the literary dialects we find in various literary works of the Middle Ages, irrespective of the variation among them, are generally referred to as the "Moddho Juger Bangla," i.e., the Bangla of the Middle period.

This is a recurrent problem throughout the history of categorizing and naming a language and determining its descendants. We can imagine the same complexity regarding the language called Magadhi Prakrit. Magadhi Prakrit is not a homogeneous language. What happens to the dialects that fall in the middle of the Sauraseni and Magadhi spectrum is often unclear. Where do we put them in a tree diagram? To a Bengali speaker, a Bhojpuri speaker's speech seems more closely related to Hindi than to Bengali. Many Bhojpuri speakers find their language more associated with Hindi for socio-cultural and political reasons. In addition, there is a noticeable influence of Hindi on Bhojpuri. All these nuances are trimmed when we retrospectively draw a tree diagram. The popular belief is that the modern Indo-Aryan languages came from the dialects that are preserved in the literary or religious texts; for example, that Bangla, Hindi, and Marathi came from Vedic Sanskrit via Classical Sanskrit via the Dramatic Prakrits. But what tells us that the early Indo-Aryans in ancient India spoke a homogeneous language?

The Indo-Aryan or Indic language speakers came to India during the late Indus Valley Civilization, approximately 1700 BCE. The earliest evidence of their language is recorded in the Rigveda, composed around 1500 BCE. Since the Aryan migration occurred in successive waves, it is difficult to ascertain that all Indo-Aryan people of Rigvedic times spoke the Rigvedic language only. Asko Parpola points out that the intrusion of non-Rigvedic dialectal forms of the Indo-Aryan language and of new subject matter into the latest book of the Rigveda indicates the mixing of Rigvedic and non-Rigvedic Indo-Aryans in India (see Parpola 1999). Besides, many experts concur that Vedic Sanskrit is not a homogeneous language, or that it looks homogeneous only in comparison with the Middle Iranian of the Indo-Iranian branch (see Gombrich 2006 and Witzel 1989). Some linguists maintain that Post-Rigvedic Sanskrit and Prakrit are two poles of the Indo-Aryan dialectal spectrum (see Deshpande 1993b). The Rigvedic language and the Sanskrits of the Vedic era are primarily liturgical languages, and the tradition of composing Vyākaraṇa, or grammar, was inspired by the need to learn and understand the Vedas correctly. Following the same tradition, Panini composed his vyākaraṇa (grammar) of the "bhāshā" (the language) used by the Śiṣṭas (the nobles) of his time. During Panini's time, there were many non-standard Indo-Aryan languages that he called apabhāshā (bad languages?). Panini's distinction between the bhāshā spoken by nobles and apabhāshā allows us to assume that Old Prakrit predates Classical Sanskrit. All this leads us to believe that the internal diversification of the Indo-Aryan language occurred in the context of a dialectal continuum and of contact: contact among the Rigvedic and non-Rigvedic dialects of the Indo-Aryan language, and contact between Indo-Aryan and non-Indo-Aryan or local languages.

A map showing the distribution of Bengali dialects while taking into account political boundaries.

The perception that the Old Prakrits descend from Vedic Sanskrit clouds the genealogy of the Middle and New Indo-Aryan languages in several ways. It neglects the existence of pre-Rigvedic Indo-Aryans and contemporary non-Rigvedic Indo-Aryans, along with their varieties of speech. Otherwise, it dubs all Indo-Aryan speeches of the Vedic period as Sanskrits, irrespective of their variance. Pollock points out that the Vedic Sanskrits are primarily liturgical dialects of Indo-Aryan speech. They were not accessible to commoners for more than a century.

Sanskrit became cosmopolitan at the beginning of the Common Era. The name Sanskrit itself suggests that it is a language that has been perfected. We can recall the comment of Chatterji regarding the literary Bangla called "Sadhu Bhasha." The name Sadhu Bhasha means a chaste language. One can argue that the idea behind these names, Sanskrit "Perfected" and Sadhu "Chaste," comes from their origin and usage. Sadhu Bhasha was codified on the basis of West Bengali dialects to write creative and legislative literature. Soon it became the medium of education.

However, experts concur that Sadhu Bhasha was never a spoken language. Likewise, no evidence has been found that Vedic Sanskrit was ever a spoken language. The same cannot be said about Classical Sanskrit. In pre-modern India, Sanskrit became a cosmopolitan language of power. We can compare it with Standard Colloquial Bangla to some extent. Standard Colloquial Bangla also started as a literary language based on West Bengali dialects. Through literature, its use as a medium of education, legislative use, and media use, Standard Bengali became the first language of many new-generation speakers in Bangladesh. In East Bengal, it has no local roots.

Finally, the linear genealogy represented in tree diagrams fails to accommodate the repeated Sanskritization of non-Sanskrit literary dialects of Aryan speech in India. For instance, there has been much discussion of the colonial project that codified the literary dialects of Bangla, namely Sadhu Bangla and Calito Bangla (Standard Colloquial Bangla). The prose writers of the colonial era Sanskritized them. However, Sanskritization was not absent in the pre-colonial era. Many pre-colonial court poets were well-versed in Sanskrit, and the Sanskrit language was also promoted by many courts of that time. These poets used literary Bangla dialects that appear to be much more Sanskritized than Chittagonian and Sylheti, and than the early evidence of the Prakrit origins of Bangla, for example, the verses of the Charyapadas, composed by the Siddhas, poet-devotees of the Sahajia sect of Buddhism from Bengal. We need a better model to capture these complexities of the New Indo-Aryan languages that are traditionally grouped into the Bengali-Assamese stock.

Ahmed Shamim is an Assistant Professor of Instruction, Asian Studies, the University of Texas at Austin.

