Languages of India

Map of South Asia in native languages.

The languages of India primarily belong to two major linguistic families, Indo-European (whose branch Indo-Aryan is spoken by about 75 percent of the population) and Dravidian (spoken by about 25 percent). Other languages spoken in India come mainly from the Austro-Asiatic and Tibeto-Burman linguistic families, as well as a few language isolates. Individual mother tongues in India number several hundred, and more than a thousand if major dialects are included.[1] The SIL Ethnologue lists 415 languages for India; 24 of these languages are spoken by more than a million native speakers, and 114 by more than 10,000. Three millennia of political and social contact have resulted in mutual influence among the four language families in India and South Asia. Two contact languages have played an important role in the history of India: Persian and English.[2]


While Hindi is the official language of the central government in India, with English as a provisional official sub-language, individual state legislatures can adopt any regional language as the official language of that state. The Constitution of India recognizes 23 official languages, spoken in different parts of the country, and two official classical languages, Sanskrit and Tamil.

Official Languages

While Hindi is the official language of the central government in India, with English as a provisional official sub-language, individual state legislatures can adopt any regional language as the official language of that state. In effect, there are "Official Languages" at the state and central levels but there is no one "national language."

Article 343 of the Indian Constitution recognizes Hindi in Devanāgarī script as the official language of the central government of India. The Constitution also allows for the continued use of the English language for official purposes. Article 345 provides constitutional recognition as "Official languages" of the union to any language adopted by a state legislature as the official language of that state. Until the Twenty-First Amendment of the Constitution in 1967, the country recognized fourteen official regional languages. The Eighth Schedule and the Seventy-First Amendment provided for the inclusion of Sindhi, Konkani, Manipuri and Nepali, increasing the number of official regional languages of India to 18. Individual states, whose borders are mostly drawn on socio-linguistic lines, are free to decide their own language for internal administration and education. In 2004, the government elevated Tamil[3] to the newly created official status of "Classical Language," followed by Sanskrit in 2005.[4]

The Constitution of India now recognizes 23 languages, spoken in different parts of the country. These consist of English plus 22 Indian languages: Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Meitei, Marathi, Nepali, Oriya, Punjabi, Sanskrit, Santhali, Sindhi, Tamil, Telugu and Urdu. Hindi is an official language of the states of Uttar Pradesh, Bihar, Jharkhand, Uttarakhand, Madhya Pradesh, Rajasthan, Chhattisgarh, Himachal Pradesh, Haryana and the National Capital Territory of Delhi. Tamil is an official language of Tamil Nadu, Puducherry and the Andaman and Nicobar Islands. English is the co-official language of the Indian Union, and each of the several states mentioned above may also have another co-official language.

The following table lists the 22 Indian languages set out in the eighth schedule as of May 2007, together with the regions where they are used:

No. Language Place(s)/Community
1. Assamese/Asomiya Assam
2. Bengali/Bangla Andaman & Nicobar Islands, Tripura, West Bengal
3. Bodo Assam
4. Dogri Jammu and Kashmir
5. Gujarati Dadra and Nagar Haveli, Daman and Diu, Gujarat
6. Hindi Andaman and Nicobar Islands, Arunachal Pradesh, Bihar, Chandigarh, Chhattisgarh, the national capital territory of Delhi, Haryana, Himachal Pradesh, Jharkhand, Madhya Pradesh, Rajasthan, Uttar Pradesh and Uttarakhand.
7. Kannada Karnataka
8. Kashmiri Jammu and Kashmir
9. Konkani Goa, Karnataka
10. Maithili Bihar
11. Malayalam Kerala, Andaman and Nicobar Islands, Lakshadweep
12. Manipuri (also Meitei or Meithei) Manipur
13. Marathi Dadra & Nagar Haveli, Daman and Diu, Goa, Maharashtra
14. Nepali Sikkim, West Bengal
15. Oriya Orissa
16. Punjabi Chandigarh, Delhi, Haryana, Punjab
17. Sanskrit Listed as a Classical Language of India.
18. Santhali Santhal tribals of the Chota Nagpur Plateau (comprising the states of Bihar, Chhattisgarh, Jharkhand, Orissa)
19. Sindhi Sindhi community
20. Tamil Tamil Nadu, Andaman & Nicobar Islands, Kerala, Puducherry. Listed as a Classical Language of India.
21. Telugu Andaman & Nicobar Islands, Andhra Pradesh
22. Urdu Andhra Pradesh, Delhi, Jammu and Kashmir, Uttar Pradesh, Tamil Nadu

Hindi and English

The Indian constitution declares Hindi in Devanagari script to be the official language of the union.[5] Unless Parliament decided otherwise, the use of English for official purposes was to cease 15 years after the constitution came into effect, that is, on January 26, 1965.[6] The prospect of the changeover led to much alarm in the non-Hindi-speaking areas of India, as a result of which Parliament enacted the Official Languages Act, 1963, providing for the continued use of English for official purposes along with Hindi, even after 1965. An attempt was made in late 1964 to expressly provide for an end to the use of English, but it was met with protests from across the country, some of which turned violent. Widespread protests occurred in states such as Tamil Nadu, Kerala, West Bengal, Karnataka, Pondicherry and Andhra Pradesh. As a result of these protests, the proposal was dropped,[7] and the Act itself was amended in 1967 to provide that the use of English would not be ended until a resolution to that effect was passed by the legislature of every state that had not adopted Hindi as its official language, and by each house of the Indian Parliament.

Language Families

The languages of India may be grouped by major language families. The largest of these families in terms of speakers is the Indo-European family, predominantly represented in its Indo-Aryan branch (accounting for some 700 million speakers), but also including minority languages such as Persian, Portuguese or French, and English spoken as a lingua franca. The second largest is the Dravidian family, accounting for some 200 million speakers. Minor linguistic families include the Munda family, with approximately nine million speakers, and the Tibeto-Burman family, with approximately six million speakers. There is also a language isolate, the Nihali language.

History of Languages in India

A bazaar in Andhra Pradesh with signs, from left to right, in Urdu, Hindi, Arabic, and English.
Language families in South Asia

The northern Indian languages from the Indo-European family evolved from Old Indo-Aryan languages such as Sanskrit, by way of the Middle Indo-Aryan Prakrit languages and the Apabhramsha of the Middle Ages. There is no consensus on the specific time when the modern north Indian languages such as Hindi, Marathi, Punjabi, and Bengali emerged, but 1000 C.E. is commonly accepted. The development of each language was influenced by social and political contact with foreign invaders and speakers of the other languages; Hindi/Urdu and closely related languages were strongly influenced by Persian and Arabic.

The South Indian (Dravidian) languages had a history independent of Sanskrit. The origins of the Dravidian languages, as well as their subsequent development and the period of their differentiation, are unclear, and adequate comparative linguistic research into the Dravidian languages is lacking. Inconclusive attempts have also been made to link the family with the Japonic languages, Basque, Korean, Sumerian, the Australian Aboriginal languages and the unknown language of the Indus valley civilization. However, in later stages, all the Dravidian languages were heavily influenced by Sanskrit. The major Dravidian languages are Telugu, Tamil, Kannada and Malayalam.

Bengali arose from the eastern Middle Indic languages of the Indian subcontinent. Magadhi Prakrit, the earliest recorded spoken language in the region, had evolved into Ardhamagadhi ("Half Magadhi") in the early part of the first millennium C.E. Ardhamagadhi, as with all of the Prakrits of North India, began to give way to what are called Apabhramsa languages just before the turn of the first millennium. The local Apabhramsa language of the eastern subcontinent, Purvi Apabhramsa or Apabhramsa Abahatta, eventually evolved into regional dialects, which in turn formed three groups: the Bihari languages, the Oriya languages, and the Bengali-Assamese languages. Some argue for much earlier points of divergence, going back to as early as 500 C.E., but the language was not static; different varieties coexisted and authors often wrote in multiple dialects.

The Austroasiatic family of languages includes the Santal and Munda languages of eastern India, Nepal, and Bangladesh, along with the Mon-Khmer languages spoken by the Khasi and Nicobarese in India and in Myanmar, Thailand, Laos, Cambodia, Vietnam, and southern China. The Austroasiatic languages are thought to have been spoken throughout the Indian subcontinent by hunter-gatherers who were later assimilated first by the agriculturalist Dravidian settlers and later by the Indo-Europeans from Central Asia. The Austroasiatic family is thought to have been the first to be spoken in ancient India. Some believe the family to be a part of an Austric superstock of languages, along with the Austronesian language family.

According to Joseph Greenberg, the Andamanese languages of the Andaman Islands and the Nihali language of central India are thought to be Indo-Pacific languages related to the Papuan languages of New Guinea, Timor, Halmahera, and New Britain. Nihali has been shown to be related to Kusunda of central Nepal. However, the proposed Indo-Pacific relationship has not been established through the comparative method, and has been dismissed as speculation by most comparative linguists. Nihali and Kusunda are spoken by hunting people living in forests. Both languages have accepted many loan words from other languages, Nihali having loans from Munda (Korku), Dravidian and Indic languages.

Classical Languages of India

In 2004, a new language category was created by constitutional decree, under which languages that met certain requirements could be accorded the status of a "classical language" in India.[3] Upon the creation of this category, Tamil and, a year later, Sanskrit, were accorded the status, and more languages are under consideration for this classification. Experts consulted by the government and the Sahitya Akademi, a literary body, recommended against officially awarding the status of "classical" to any language.

The government has declared Tamil a classical language despite the objections of experts it consulted and after a committee it had appointed refused to recommend it…. The Sahitya Akademi office bearers wrote a second time. In essence, they repeated that it was not the government's business to declare a language classical. It is a classically foolish move, a source said.[8]

In the mid-nineteenth century, Indologists referred to Paninian Sanskrit as "classical Sanskrit," distinguishing it from the older Vedic language.[9][10][11] Robert Caldwell, the first linguist to systematically study the Dravidian languages as a family, used the term "classical" to distinguish the literary forms of Kannada, Tamil, Telugu and Malayalam from the diglossic colloquial forms.[12] In the second half of the twentieth century, academics began to suggest that the Old Tamil poems of the Sangam anthologies were also "classical" in the sense that they shared many features with literatures commonly accepted as classical. This point, first made by Kamil Zvelebil in the 1970s,[13] has since been supported by a number of other scholars,[14][15][16] and the terminology "classical Tamil" is commonly used in historical literature to refer to texts from that period.[17] Martha Ann Selby argues that if classicality is defined with reference to age and the value a literature has within the tradition it represents, the Tamil poetry of the Sangam anthologies and the Maharashtri poems of the Sattasai are "classical," in addition to Sanskrit literature.[18]

Writing Systems

Indian languages have corresponding distinct alphabets. The two major families are those of the Dravidian languages and those of the Indo-Aryan languages, the former largely confined to the south and the latter to the north. Urdu and sometimes Kashmiri, Sindhi and Punjabi are written in modified versions of the Arabic script. Except for these languages, the alphabets of Indian languages are native to India. Most scholars consider these Indic scripts a distant offshoot of the Aramaic alphabet, although there are differing opinions.

Brahmic Script

Brahmic scripts are descended from the Brāhmī script of ancient India, which may have had a common ancestor with European scripts. However, some academics (see references in Rastogi 1980:88-98) believe that the Vikramkhol[19][20][21] inscription is conclusive evidence that Brahmi had indigenous origins, probably from the Indus Valley (Harappan) script.

The most prominent member of the family is Devanagari, which is used to write several languages of India and Nepal, including Hindi, Konkani, Marathi, Nepali, Nepal Bhasa and Sanskrit. Other northern Brahmic scripts include the Eastern Nagari script, which is used to write Bengali, Assamese, Bishnupriya Manipuri, and other eastern Indic languages, the Oriya script, the Gujarāti script, the Ranjana script, the Prachalit script, the Bhujimol script and the Gurmukhi script. The Dravidian languages of southern India have Brahmic scripts that have evolved to suit southern needs. The earliest evidence for Brahmi script in South India comes from Bhattiprolu in the Guntur district of Andhra Pradesh. Bhattiprolu was a great centre of Buddhism during the third century C.E., from where Buddhism spread to East Asia. The present Telugu script is derived from the "Telugu-Kannada script," also known as the "old Kannada script," owing to its similarity to the same.[22] Initially, minor changes were made to Brahmi, producing what is now called Tamil Brahmi, which has far fewer letters than some of the other Indic scripts because it has no separate aspirated or voiced consonants. Later, under the influence of Grantha, Vetteluthu evolved, which looks similar to the present-day Malayalam script. Further changes were made in the nineteenth and twentieth centuries to meet printing and typewriting needs, resulting in the present script.

Burmese, Cambodian, Lao, Thai, Javanese, Balinese and Tibetan are also written in Brahmic scripts, though with considerable modification to suit their phonology. The Siddham (kanji: 悉曇, modern Japanese pronunciation: shittan) script was especially important in Buddhism because many sutras were written in it, and the art of Siddham calligraphy survives today in Japan.


Chalipa panel, Mir Emad.

Nastaʿlīq (also anglicized as Nastaleeq; نستعلیق nastaʿlīq), one of the main genres of Islamic calligraphy, was developed in Iran in the fourteenth and fifteenth centuries. A less elaborate version of Nastaʿlīq serves as the preferred style for writing Persian, Pashto and Urdu. Nastaʿlīq is amongst the most fluid calligraphy styles for the Arabic alphabet. It has short verticals with no serifs, and long horizontal strokes. It is written using a piece of trimmed reed with a tip of 5–10 mm, called "qalam" ("pen," in Arabic), and carbon ink, named "davat." The nib of a qalam is usually split in the middle to facilitate ink absorption.

Example showing Nastaʿlīq's proportion rules.

After the Islamic conquest of Persia, Iranians adopted the Perso-Arabic script and the art of Arabic calligraphy flourished in Iran alongside other Islamic countries. The Mughal Empire used Persian as the court language during their rule over the Indian subcontinent. During this time, Nastaʿlīq came into widespread use in South Asia, including Pakistan, India, and Bangladesh. In Pakistan, almost everything in Urdu is written in the script, concentrating the greater part of the world usage of Nastaʿlīq there. In Hyderābād, Lakhnau, and other cities in India with large Urdu-speaking populations, many street signs are written in Nastaʿlīq. The status of Nastaʿlīq in Bangladesh was the same as in Pakistan until 1971, when Urdu ceased to be an official language of the country. Today, only a few neighborhoods (mostly inhabited by Bihāris) in Dhaka and Chittagong retain the influence of Persian and Nastaʿlīq.


The National Library at Kolkata romanization is the most widely used transliteration scheme in dictionaries and grammars of Indic languages. This transliteration scheme is also known as Library of Congress and is nearly identical to one of the possible ISO 15919 variants. The tables below mostly use Devanagari but include letters from Kannada, Tamil, Malayalam and Bengali to illustrate the transliteration of non-Devanagari characters. The scheme is an extension of the IAST scheme that is used for transliteration of Sanskrit.

Vowels and vowel modifiers: a ā i ī u ū e ē ai o ō au aṃ (अं) aḥ (अः)

Consonants (written with the inherent vowel a):
ka kha ga gha ṅa ca cha ja jha ña
ṭa ṭha ḍa ḍha ṇa ta tha da dha na
pa pha ba bha ma ẏa ḻa ḷa ṟa ṉa
ya ra la va śa ṣa sa ha

Consonants by place and manner of articulation (in order: unvoiced unaspirated, unvoiced aspirated, voiced unaspirated, voiced aspirated, nasal):
velar plosives: k, kh, g, gh, ṅ
palatal affricates: c, ch, j, jh, ñ
retroflex plosives: ṭ, ṭh, ḍ, ḍh, ṇ
dental plosives: t, th, d, dh, n
bilabial plosives: p, ph, b, bh, m
glides and approximants: y, r, l, v
fricatives: ś, ṣ, s, h
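The way such a romanization applies to Devanagari text can be illustrated with a minimal Python sketch. It is not an implementation of the full National Library at Kolkata scheme: the function name transliterate and the table names are hypothetical, and the tables cover only the consonants listed above, the vowels a ā i ī u ū, and the modifiers aṃ and aḥ. It assumes the usual Devanagari convention that a consonant carries an inherent a unless a vowel sign or a virama follows; it does not handle the remaining vowels, nukta forms, or the dropping of the final a in spoken Hindi.

```python
# Illustrative subset of a Devanagari-to-Latin romanization table.
# All names here are hypothetical; the table is deliberately incomplete.
CONSONANTS = {
    "क": "k", "ख": "kh", "ग": "g", "घ": "gh", "ङ": "ṅ",
    "च": "c", "छ": "ch", "ज": "j", "झ": "jh", "ञ": "ñ",
    "ट": "ṭ", "ठ": "ṭh", "ड": "ḍ", "ढ": "ḍh", "ण": "ṇ",
    "त": "t", "थ": "th", "द": "d", "ध": "dh", "न": "n",
    "प": "p", "फ": "ph", "ब": "b", "भ": "bh", "म": "m",
    "य": "y", "र": "r", "ल": "l", "व": "v",
    "श": "ś", "ष": "ṣ", "स": "s", "ह": "h",
}
INDEPENDENT_VOWELS = {
    "अ": "a", "आ": "ā", "इ": "i", "ई": "ī", "उ": "u", "ऊ": "ū",
}
# Dependent vowel signs (matras), given as code points for clarity.
VOWEL_SIGNS = {
    "\u093e": "ā",  # sign aa
    "\u093f": "i",  # sign i
    "\u0940": "ī",  # sign ii
    "\u0941": "u",  # sign u
    "\u0942": "ū",  # sign uu
}
MODIFIERS = {
    "\u0902": "ṃ",  # anusvara
    "\u0903": "ḥ",  # visarga
}
VIRAMA = "\u094d"  # suppresses the inherent vowel of a consonant


def transliterate(text: str) -> str:
    """Romanize Devanagari text; unknown characters pass through unchanged."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # explicit vowel replaces inherent a
                i += 1
            elif nxt == VIRAMA:
                i += 1  # bare consonant, no vowel
            else:
                out.append("a")  # inherent vowel
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        elif ch in MODIFIERS:
            out.append(MODIFIERS[ch])
        else:
            out.append(ch)
        i += 1
    return "".join(out)


print(transliterate("भारत"))  # bhārata
```

Because the sketch romanizes the written form letter by letter, the final a of a word is kept even where it is not pronounced in modern speech.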


The Indian census of 1961 recognized 1,652 different languages in India (including languages not native to the subcontinent). The 1991 census recognized 1,576 classified "mother tongues." SIL Ethnologue lists 415 living "Languages of India" (out of 6,912 worldwide).

According to the 1991 census, 22 languages have more than a million native speakers, 50 have more than 100,000, and 114 have more than 10,000 native speakers. The remaining languages account for a total of 566,000 native speakers (out of a total of 838 million Indians in 1991).

The largest language that is not one of the 22 "languages of the 8th Schedule" with official status is the Bhili language, with some 5.5 million native speakers (ranked 13th by number of speakers), followed by Gondi (15th), Tulu (19th) and Kurukh (20th). On the other hand, three languages with fewer than one million native speakers are included in the 8th Schedule for cultural or political reasons: English (40th), Dogri (54th) and Sanskrit (67th).


  1. More than a thousand including major dialects. The 1991 census recognized "1576 rationalized mother tongues" which were further grouped into language categories; the 1961 census recognized 1,652. Mother Tongues of India According to the 1961 Census, Language in India. Retrieved November 29, 2017.
  2. Tej K. Bhatia, and William C. Ritchie, (eds.)(2006) "Bilingualism in South Asia." 780-807. In: Handbook of Bilingualism. (Oxford: Blackwell Publishing, ISBN 0631227350).
  3. India sets up classical languages BBC News, September 17, 2004. Retrieved November 29, 2017.
  4. Sanskrit to be declared classical language The Hindu, October 28, 2005. Retrieved November 29, 2017.
  5. Article 343(1) Official language of the Union. Retrieved November 29, 2017.
  6. Article 343. Official language of the Union. Retrieved November 29, 2017.
  7. Duncan B. Forrester, "The Madras Anti-Hindi Agitation, 1965: Political Protest and its Effects on Language Policy in India." Pacific Affairs 39 (1/2) (Spring - Summer 1966):19-36.
  8. Sujan Dutta, Classic case of politics of language The Telegraph, September 28, 2004. Retrieved November 29, 2017.
  9. William D. Whitney, "On the History of the Vedic Texts." Journal of the American Oriental Society 4 (1854): 245-261. via JSTOR. Retrieved November 29, 2017.
  10. William D. Whitney, "On the Main Results of the Later Vedic Researches in Germany" Journal of the American Oriental Society 3 (1853): 289-328. via JSTOR. Retrieved November 29, 2017.
  11. James Cowles Prichard, "Anniversary Address for 1848, to the Ethnological Society of London on the Recent Progress of Ethnology" Journal of the Ethnological Society of London 2(1850): 119-149. via JSTOR. Retrieved November 29, 2017.
  12. Robert Caldwell, A Comparative Grammar of the Dravidian Or South-Indian Family of Languages. (original 1913) (New Delhi: Asian Educational Services, Second AES reprint 1998. ISBN 8120601173), 30, 78-81.
  13. Kamil Zvelebil, Tamil Literature (Leiden: E.J. Brill, 1975. ISBN 9004041907), 5-21, 50-53.
  14. Takanobu Takahashi, Tamil Love Poetry and Poetics (Brill's Indological Library) (Leiden: E.J. Brill, 1995. ISBN 9004100423), 2.
  15. A.K. Ramanujan, Poems of Love and War from the Eight Anthologies and the Ten Long Poems of Classical Tamil (UNESCO Collection of Representative Works) (New York: Columbia University Press, 1985. ISBN 0231051069), ix.
  16. E. Annamalai and Sanford B. Steever, (eds.) "Modern Tamil" in The Dravidian Languages (London: Routledge, 1998, ISBN 0415100232), 100-128. 100.
  17. See e.g. Burton Stein, "Circulation and the Historical Geography of Tamil Country." The Journal of Asian Studies 37 (1977): 7-261. via JSTOR. Retrieved November 29, 2017; Clarence Maloney, "The Beginnings of Civilization in South India." The Journal of Asian Studies 29 (3) (1970): 603-616. via JSTOR Retrieved November 29, 2017.
  18. Martha Ann Selby, Grow long, Blessed Night: Love Poems from Classical India (New York: Oxford University Press, 2000, ISBN 019512734X), 3-4.
  19. Vikramkhol, Angelfire. Retrieved November 29, 2017.
  20. Naresh Prasad Rastogi, Origin of Brāhmī Script: The Beginning of Alphabet in India (Varanasi: Chowkhamba Saraswatibhawan, 1980).
  21. Rock Painting and Lithography of Bikramkhol. Retrieved November 29, 2017.
  22. S. M. R. Adluri, Telugu Language and Literature, Figures T1a and T1b. Retrieved November 29, 2017.


  • Alam, Muzaffar. The Languages of Political Islam: India, 1200-1800. Chicago: University of Chicago Press, 2004. ISBN 0226011003.
  • Annamalai, E. and Sanford B. Steever (eds.). "Modern Tamil." in The Dravidian Languages. London: Routledge, 1998, 100-128. ISBN 0415100232.
  • Beames, John. A comparative grammar of the modern Aryan languages of India to wit, Hindi, Panjabi, Sindhi, Gujarati, Marathi, Oriya and Bangali. Delhi: Munshiram Manoharlal, 1966.
  • Bhatia, Tej K., and William C. Ritchie (eds.). "Bilingualism in South Asia." 780-807. In: Handbook of Bilingualism. Oxford: Blackwell Publishing, 2006. ISBN 0631227350.
  • Caktivēl, Cu. Tribal languages of India. Kothaloothu, Madurai District: Meena Pathippakam, 1976.
  • Caldwell, Robert. A Comparative Grammar of the Dravidian Or South-Indian Family of Languages. (original 1913) reprint ed. New Delhi: Asian Educational Services, Second AES reprint 1998. ISBN 8120601173.
  • Campbell, George. Specimens of languages of India. New Delhi: Asian Educational Services, 1986.
  • Farmer, Steve, Richard Sproat and Michael Witzel. "The Collapse of the Indus-Script Thesis: The Myth of a Literate Harappan Civilization." EJVS 11 (2) (Dec 2004).
  • Haldar, Gopal, and Tista Bagchi. Languages of India. New Delhi: National Book Trust, India, 2000. ISBN 8123729367.
  • Ramanujan, A.K. Poems of Love and War from the Eight Anthologies and the Ten Long Poems of Classical Tamil. UNESCO Collection of Representative Works. New York: Columbia University Press, 1985. ISBN 0231051069
  • Rastogi, Naresh Prasad. Origin of Brāhmī Script: The Beginning of Alphabet in India. Varanasi: Chowkhamba Saraswatibhawan, 1980.
  • Selby, Martha Ann. Grow long, Blessed Night: Love Poems from Classical India. New York: Oxford University Press, 2000. ISBN 019512734X.
  • Scharfe, Hartmut. "Kharoṣṭhī and Brāhmī." Journal of the American Oriental Society. 122 (2) (2002):391-393.
  • Stevens, John. Sacred Calligraphy of the East, 3rd ed. Rev. Boston: Shambhala, 1995. ISBN 1570621225.
  • Takahashi, Takanobu. Tamil Love Poetry and Poetics. Brill's Indological Library. Leiden: E.J. Brill, 1995. ISBN 9004100423.
  • Trail, Ronald L. Patterns in clause, sentence, and discourse in selected languages of India and Nepal. Summer Institute of Linguistics publications in linguistics and related fields, publication no. 41. Kathmandu: University Press, Tribhuvan University, 1973. ISBN 088312047X.
  • Zvelebil, Kamil. Tamil Literature. Leiden: E.J. Brill, 1975. ISBN 9004041907



New World Encyclopedia writers and editors rewrote and completed the Wikipedia article in accordance with New World Encyclopedia standards. This article abides by terms of the Creative Commons CC-by-sa 3.0 License (CC-by-sa), which may be used and disseminated with proper attribution. Credit is due under the terms of this license that can reference both the New World Encyclopedia contributors and the selfless volunteer contributors of the Wikimedia Foundation.

Note: Some restrictions may apply to use of individual images which are separately licensed.