Hindustani etymology

Hindustānī, also known as Hindi-Urdu, comprises several closely related dialects in the northern, central and northwestern parts of the Indian subcontinent. It encompasses two standardized registers in the form of the official languages Hindi and Urdu, as well as several nonstandard dialects. Hindustani is not an immediate descendant of Sanskrit, but uses a large lexicon of loanwords.[1][2]

Standard Hindi derives much of its formal and technical vocabulary from Sanskrit while standard Urdu derives much of its formal and technical vocabulary from Persian and Arabic. Standard Hindi and Urdu are used primarily in public addresses and radio or TV news, while the everyday spoken language is one of the several varieties of Hindustani, whose vocabulary contains words drawn from Persian, Arabic, and Sanskrit. In addition, spoken Hindustani includes words from English and the Dravidian languages, as well as several others.

Hindustani developed over several centuries throughout much of the northern subcontinent, including the areas that comprise modern-day India, Pakistan, and Nepal. Just as the core vocabulary of English evolved from Old English (Anglo-Saxon) while assimilating a large number of words borrowed from French and other languages (whose pronunciations often shifted to become easier for English speakers), Hindustani can be said to have evolved from Sanskrit while borrowing many Persian and Arabic words over the years, changing their pronunciations (and often even their meanings) to make them easier for Hindustani speakers to pronounce. A large number of Persian words entered the Hindustani lexicon through the influence of the Turco-Mongol Mughal rulers of north India, who followed a highly Persianised culture and spoke Persian. Many Arabic words entered Hindustani via Persian, having previously been assimilated into Persian through the influence of Arabs in the area. The dialect of Persian spoken by the Mughal ruling elite was known as 'Dari', the dialect of Persian spoken in modern-day Afghanistan. Hindustani is thus the naturally developed common language of north India. This article deals with the separate categories of Hindustani words and some of the common words found in the language.

Traditional categorization of Hindustani (Hindi-Urdu) words in Hindi pedagogy

Words in Hindustani are analyzed in traditional Hindi pedagogy as falling into the following categories:[3]

  • Tadbhava (तद्भव/تَدبھَو derived from): Words that are derived from Sanskrit or Prakrit, but often with phonetic or morphological transformation.
  • Tatsama (तत्सम/تَتسَم identical): Words which are spelled exactly the same in written Hindi as they are in standard Sanskrit.
  • Deshaja (देशज/دیشَج local): Words that cannot be traced back to Sanskrit, and are of local origin.
  • Videshi (विदेशी/وِدیشی foreign): Loanwords from non-Indian languages that include Persian, Turkish, Arabic, Portuguese, or English.

The use of tatsama words was much less common in Apabhramsha.[citation needed] The most common words in Hindustani are tadbhava and are derived through Prakrit and Apabhramsha.[citation needed]

Apart from being written in different scripts, Urdu as spoken in Pakistan and some Indian states and Hindi as spoken in India are often very similar. Modern Standard Hindi generally incorporates vocabulary of Persian and Arabic origin found in Urdu, while Urdu has not been known to incorporate much of the vocabulary of Sanskrit origin found in Hindi.

Examples of Hindustani Word Derivations

Origin of āp (आप آپ), tum (तुम تم) and tū (तू تو)

In Hindustani, the pronoun āp, which denotes respect or formality, originates from Sanskrit ātman (आत्मन्),[4] which denotes the higher self or level of consciousness.

The pronoun tum, which denotes informality or intimacy, originates from Prakrit tumma (तुम्म), ultimately from Sanskrit yushma (युष्म) (the base of the 2nd person plural pronoun).[5]

The pronoun tū originates from Prakrit tuhu (तुहुं).[6] In modern usage, tū is used widely in India to denote a wide range of attitudes, depending on context, from extreme informality to derision and even extreme reverence. However, use of tū is considered highly offensive in most contexts in Urdu, the exception being its use as the second-person pronoun for God. This trend is comparable to the decline of "thou" in English and of "tu" in Brazilian Portuguese.

Origin of hai (है ہے)

One of the most common words in Hindustani is hai ("is"). It originates from the following developments:

Sanskrit s sometimes becomes h in Prakrits.

Shortening of ahai produced hai. In some older works of Hindustani literature, one can find ahai in use. For example, Bharatendu Harishchandra wrote: "निज भाषा उन्नति अहै, सब उन्नति को मूल" ("نِج بھاشا اُنّتی اہَے، سب اُنّتی کو مُول"). In Marathi the अ remained, and the equivalent of hai is āhe (आहे). Similarly, the Sindhi equivalent of hai is āhe.

Derivation of jātā (जाता جاتا) and gayā (गया گیا)

The word jātā ("goes") is from the Sanskrit root yā (yāti, yāta). ya often becomes ja in Prakrit.[citation needed]

The word gayā ("went") is from the Sanskrit root gam (gacchati), through gatah.[citation needed] Here t transforms to y in Prakrit.

Ājā (आजा آجا) and dādā (दादा دادا)

The word ājā has also been used in Northern India and Pakistan for "grandfather". It is derived from arya, meaning "sir" in this case.[citation needed] Jain nuns are addressed either as Aryika or Ajji.

The word dādā has a similar meaning, which varies by region. It is used in some regions for "father", in others for "older brother", and in others even for "grandfather". This word is an amalgam of two sources:

  • Sanskrit tāta, a term used to address intimate persons, meaning either "sir" or "dear".[citation needed]
  • Tau, meaning "father's older brother", which is also derived from tāta.[citation needed]

Baṛā (बड़ा بڑا)

The word baṛā ("older/bigger") is derived from the Sanskrit vriddha through Prakrit vaḍḍha.


The vocabulary of Hindustani includes loanwords from Sanskrit, Persian, Turkish, Arabic, Portuguese, English and Dravidian languages.

Loanwords from Sanskrit

Phonetic Alterations

Many Sanskrit words which were loaned into Prakrits in the pre-modern age underwent phonetic alterations to facilitate ease of pronunciation. In spoken language, these include the merger of श (ś) and ष (ṣ), as well as ऋ (r̥) and रि (ri). Other common alterations were sh (श / ش) becoming s (स / س), v (व / و) becoming b (ब / ب) and y (य / ی) becoming j (ज / ج). Short vowel sounds were also sometimes introduced to break up consonant clusters. Such words fall under the tadbhava category.

| Sanskrit | Hindustani | English Translation |
|---|---|---|
| varṣa / वर्ष | baras / बरस / برس | year |
| desha / देश | des / देस / دیس | country |
| vāsī / वासी | bāsī / बासी / باسی | inhabitant |
| yãtra / यंत्र | jãtar / जंतर / جنتر | device |
| rātri / रात्रि | rāt / रात / رات | night |
| ardha / अर्ध | ādhā / आधा / آدھا | half |
| sūrya / सूर्य | sūraj / सूरज / سورج | sun |
| agni / अग्नि | āg / आग / آگ | fire |
| bhaginī / भगिनी | bahin / बहिन / بہن | sister[7] |
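
As a rough illustration of the three consonant substitutions named above (sh to s, v to b, y to j), the following Python sketch applies them to romanized strings. The function name `apply_consonant_shifts` and the rule list are assumptions made for this example only. Real sound change depends on phonetic context, and the sketch does not model vowel insertion or the loss of final vowels, so it reproduces only some of the attested forms in the table.

```python
# Illustrative sketch only: the rule list and function name are hypothetical,
# and the rules are applied unconditionally to romanized text.
CONSONANT_SHIFTS = [
    ("sh", "s"),  # sh (श) becomes s (स)
    ("v", "b"),   # v (व) becomes b (ब)
    ("y", "j"),   # y (य) becomes j (ज)
]


def apply_consonant_shifts(word: str) -> str:
    """Apply each substitution, in order, to a romanized word."""
    for old, new in CONSONANT_SHIFTS:
        word = word.replace(old, new)
    return word


if __name__ == "__main__":
    for sanskrit in ["vāsī", "desha"]:
        print(sanskrit, "->", apply_consonant_shifts(sanskrit))
```

With these rules, vāsī yields the attested bāsī, whereas desha yields desa rather than des, because the loss of the final vowel is not modeled.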

Loanwords from Persian

Persian loanwords that were not artificially added to Hindustani derive from Classical Persian, a historical dialect that is not the same as modern-day Persian.

Examples of common Persian loanwords and corresponding Sanskrit loans

| Hindustani | Meaning | Persian | Corresponding Sanskrit loan |
|---|---|---|---|
| sāyā / साया / سایہ | shadow | sāyah / سایه | chaya / छाया |
| pareshān / परेशान / پریشان | distressed | parēshān / پرِیشان | chintit / चिंतित |
| hameshā / हमेशा / ہميشہ | always | hamēshah / همِیشه | sadaiv / सदैव |
| xushī / ख़ुशी / خوشی | happiness | khushī / خوشی | anand / आनंद |
| sabzī / सब्ज़ी / سبزی | vegetable | sabzī / سبزی | bhaaji / भाजी |
| mehrbān / मेहरबान / مہربان | kind | mehrbān / مهربان | dayalu / दयालु |
| firdaus / फ़िर्दौस / فردوس | paradise | firdaus / فردوس | parlok / परलोक |
| dīvār / दीवार / دیوار | wall | dīwār / دیوار | badha / बाधा |
| darvāzā / दरवाज़ा / دروازه | door | darwāzah / دروازه | dvaar / द्वार |
| tāzā / ताज़ा / تازه | fresh | tāzah / تازه | nirmal / निर्मल |
| roz / रोज़ / روز | day | rōz / رُوز | divas / दिवस |
| shahr / शहर / شہر | city | shahr / شهر | nagar / नगर |
| dāstān / दासतान / داستان | story | dāstān / داستان | kahaani / कहानी |
| hind / हिंद / ہند | India | hind / هند | bharat / भारत |
| sharāb / शराब / شراب | wine | sharāb / شراب | madira / मदिरा |

Present Stem

Some Hindustani words derive from the Present Stem of Persian verbs:

| Hindustani | Meaning | Persian Root | Corresponding Sanskrit loan |
|---|---|---|---|
| par / पर / پر | wing | parīdan / پریدن - to fly | paksh / पक्ष |
| pasand / पसंद / پسند | liked | pasandīdan / پسندیدن - to prefer | chah / चाह |
| xvāb / ख़्वाब / خواب | dream | khābīdan / خوابیدن - to sleep | sapna / सपना |

Past Stem

Others are derived from the Past Stem:

| Hindustani | Meaning | Persian Root | Corresponding Sanskrit loan |
|---|---|---|---|
| āmad / आमद / آمد | arrival | āmadan / آمدن - to come | aagaman / आगमन |
| shikast / शिकस्त / شکست | defeat | shikastan / شکستن - to break | parajay / पराजय |
| giraft / गिरफ़्त / گرفت | grip | giriftan / گرفتن - to take | pakad / पकड़ |

Present Participle

The Present Participles of some Persian verbs are used to describe something or someone that does the action indicated by the verb:

| Hindustani | Meaning | Persian Root | Corresponding Sanskrit loan |
|---|---|---|---|
| āindā / आइन्दा / آینده | future | āyandah / آینده < āmadan / آمدن - to come | bhavisya / भविष्य |
| parindā / परिन्दा / پرنده | bird | parandah / پرنده < parīdan / پریدن - to fly | panchi / पंछी |
| zindā / ज़िन्दा / زنده | alive | zindah / زنده < zīstan / زیستن - to live | jeevit / जीवित |

Past Participle

In a similar fashion, Past Participles are used to describe something which has done an action, or has had an action done to it, in the past:

| Hindustani | Meaning | Persian Root | Corresponding Sanskrit loan |
|---|---|---|---|
| bastā / बस्ता / بستہ | bag | bastah / بسته < bastan / بستن - to close | jhola / झोला |
| pasandīdā / पसन्दीदा / پسندیده | favorite | pasandīdah / پسندیده < pasandīdan / پسندیدن - to prefer | priy / प्रिय |
| murdā / मुर्दा / مُرده | dead | murdah / مرده < mordan / مردن - to die | mritak / मृतक |

Loaned Verbs

Some Hindustani verbs are formed directly from Persian verbs by changing the ending of the infinitive:

| Hindustani Infinitive | Meaning | Persian Infinitive | Corresponding Sanskrit loan |
|---|---|---|---|
| ḳharīdnā / ख़रीदना / خریدنا | to buy | kharīdan / خریدن | molna / मोलना |
| guzarnā / गुज़रना / گذرنا | to pass (intransitive) | guzashtan / گذشتن | beetnā / बीतना |
| guzārnā / गुज़ारना / گُذارنا | to pass (transitive) | guzāshtan / گذاشتن | bitānā / बिताना |
| laraznā / लरज़ना / لرزنا | to tremble | larzīdan / لرزیدن | kaanpna / काँपना |
| nawāznā / नवाज़ना / نوازنا | to patronise | nawāḳhtan / نواختن | daan karna / दान करना |

Derived Nouns

Nouns formed by adding the ending '-ish' (इश / ـِش) to a verb stem are also used:

| Hindustani | Meaning | Persian Root | Corresponding Sanskrit loan |
|---|---|---|---|
| parvarish / परवरिश / پرورش | care | parwardan / پروردن - to rear | paalan / पालन |
| koshish / कोशिश / کوشش | effort | kōshīdan / کوشیدن - to try | prayatn / प्रयत्न |
| varzish / वर्ज़िश / ورزش | exercise | warzīdan / ورزیدن - to exercise | parishram / परिश्रम |
| āzmāish / आज़माइश / آزمائش | test | āzmūdan / آزمودن - to test | parikshan / परीक्षण |
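
The '-ish' pattern can be sketched in Python as simple suffixation. The stem table below is an assumption supplied for illustration (the present stems themselves are not listed in the table above), as is the function name `derive_ish_noun`. The sketch works on the Persian romanization only, so its output keeps w and long ō where the Hindustani forms in the table have v and o.

```python
# Hypothetical lookup table: present stems assumed for the four Persian
# infinitives in the table above. Persian stems are not predictable from the
# infinitive by a single rule, so they are listed rather than computed.
PRESENT_STEMS = {
    "parwardan": "parwar",
    "kōshīdan": "kōsh",
    "warzīdan": "warz",
    "āzmūdan": "āzmā",
}


def derive_ish_noun(infinitive: str) -> str:
    """Return the '-ish' noun for an infinitive listed in PRESENT_STEMS."""
    return PRESENT_STEMS[infinitive] + "ish"


if __name__ == "__main__":
    for infinitive in PRESENT_STEMS:
        print(infinitive, "->", derive_ish_noun(infinitive))
```

A lookup table is used instead of stripping the infinitive ending because an ending-stripping rule would not cover verbs such as āzmūdan, whose stem is assumed here to be āzmā.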

Loanwords from Turkic languages

There are very few pure Turkic words in Hindustani, as few as 24 according to some sources.[8] Other words attributed to Turkish are common to Hindustani and Turkish but have other origins, mostly Arabic or Persian.[9] Both languages also share loans from English. Most notably, some honorifics and surnames common in the Hindustani belt originate from Turkic languages, most probably due to the influence of the Mughal rulers, who were ethnically Turkic. Examples of honorifics include 'Ḳhānam' (ख़ानम خانم), 'Bājī' (बाजी باجی), and 'Begam' (बेगम بیگم). Common surnames include 'Ḳhān' (ख़ान خان), 'Chuġtāī' (चुग़ताई چغتائی), 'Pāshā' (पाशा پاشا), and 'Arsalān' (अर्सलान ارسلان). Some common Turkic words used in everyday Hindustani are 'Qainchī' (क़ैंची قینچی) - scissors, 'Annā' (अन्ना انّا) - governess, 'Tamġā' (तमग़ा تمغہ) - medal, and 'Chaqmaq' (चक़मक़ چقمق) - flint.

Loanwords from Arabic

Some of the most commonly used loanwords from Arabic include 'Waqt' (वक़्त وقت) - time, 'Qalam' (क़लम قلم) - pen, 'Kitāb' (किताब کتاب) - book, 'Qarīb' (क़रीब قریب) - near, 'Sahī' (सही صحیح) - correct, 'Gharīb' (ग़रीब غریب) - poor, 'Amīr' (अमीर امیر) - rich, 'Duniyā' (दुनिया دنیا) - world, 'Hisāb' (हिसाब حساب) - calculation, 'Qudrat' (क़ुदरत قدرت) - nature, 'Nasīb' (नसीब نصیب) - fate, 'Ajīb' (अजीब عجیب) - unusual, 'Qānūn' (क़ानून قانون) - law, 'Khabar' (ख़बर خبر) - news, 'Akhbār' (अख़बार اخبار) - newspaper, 'Qilā' (क़िला قلعہ) - fort, 'Kursī' (कुर्सी کرسی) - chair, 'Sharbat' (शर्बत شربت) - drink/beverage, 'Qamīs' (क़मीस قميص) - shirt, 'Zarūrī' (ज़रूरी ضروری) - necessary, etc.[10]

Loanwords from Portuguese

A small number of words were borrowed from Portuguese due to interaction with colonists and missionaries. These include the following:

| Hindustani | Meaning | Portuguese |
|---|---|---|
| Nāv / नाव / ناو | Boat | Nau |
| Anannās / अनन्नास / اننّاس | Pineapple | Ananás |
| Pādrī / पाद्री / پادری | Priest | Padre |
| Bāltī / बाल्टी / بالٹی | Bucket | Balde |
| Chābī / चाबी / چابی | Key | Chave |
| Girjā / गिर्जा / گرجا | Church | Igreja |
| Almārī / अलमारी / الماری | Cupboard | Armário |
| Botal / बोतल / بوتل | Bottle | Botelha |

Loanwords from English

Loanwords were borrowed from English into Hindustani through interaction with the British East India Company and later British rule. English-language education for the native administrative and richer classes during the period of British rule accelerated the adoption of English vocabulary in Hindustani. Many technical and modern terms were borrowed from English, such as doctor (डॉक्टर ڈاکٹر), taxi (टैक्सी ٹیکسی), and kilometer (किलोमीटर کلومیٹر). The influence of English and assimilation of new loanwords continues to the present day.

Phonetic Alterations

Some loanwords borrowed from English undergo a significant phonetic transformation. This can be done either intentionally, in order to nativize words or to make them sound more authentic or less 'English', or it can happen as a natural process. Words often undergo a phonetic change in order to make them easier for native speakers to pronounce. Other words are changed due to corruption, where an alternate pronunciation becomes an accepted norm and overtakes the original as the most used pronunciation. Altered pronunciations may also be the result of a lack of English education, or incomplete knowledge of English phonetics.

| English | Hindustani |
|---|---|
| Dozen | darjan[11] / दर्जन / درجن |
| Treasury | tijorī / तिजोरी / تجوری |
| Subtlety | satalta / सतलता / ستلتا |
| Match | mācis / माचिस / ماچس |
| Godown | godām / गोदाम / گودام |
| Bugle | bigul / बिगुल / بگل |
| Recruit | raṅgrūṭ / रंगरूट / رنگروٹ |
| Tomato | ṭamāṭar / टमाटर / ٹماٹر |
| Cabinet | kābīnā / काबीना / کابینہ |
| Kettle | ketlī / केतली / کیتلی |
| Drawer | darāz / दराज़ / دراز |
| Bomb | bam / बम / بم |
| Lantern | lālṭen / लालटेन / لالٹین |
| Butcher | būcaṛ / बूचड़ / بوچڑ |
| Tank | ṭaṅkī / टंकी / ٹنکی |
| Box | baksā / बक्सा / بکسا |
| January | janvarī / जनवरी / جنوری |


  1. ^ "A Guide to Hindi". BBC - Languages - Hindi. BBC. Retrieved 11 December 2015.
  2. ^ Kumar, Nitin. "Hindi & Its Origin". Hindi Language Blog. Retrieved 11 December 2015.
  3. ^ Masica, p. 65
  4. ^ "aap". rekhta.org. Rekhta Foundation. Retrieved 24 December 2016.
  5. ^ "tum". rekhta.org. Rekhta Foundation. Retrieved 25 December 2016.
  6. ^ "tu". rekhta.org. Rekhta Foundation. Retrieved 25 December 2016.
  7. ^ Morgenstierne, Georg (1950). "Svásā and bhaginī in Modern Indo-Aryan". Acta Orientalia. 21 (1): 27–32.
  8. ^ Anwer, Syed Mohammed (13 November 2011). "Language: Urdu and the borrowed words". dawn.com.
  9. ^ Maldonado García, María Isabel; Yapici, Mustafa (2014). "Common Vocabulary in Urdu and Turkish Language: A Case of Historical Onomasiology" (PDF). Pakistan Vision. 15 (1): 194–225.
  10. ^ Platts, John T. "A قميص qamīṣ, vulg. qamīz, kamīj, s.m. A shirt; a shift; a chemise (cf. It. camicia; Port. camisa)". A Dictionary of Urdu, Classical Hindi, and English. University of Chicago. Retrieved 6 December 2014.
  11. ^ With an intrusive hypercorrect 'r' via non-rhotic British English
  • Hindi Language and Literature, a site about Hindi's usage, dialects, and history by Dr. Yashwant K. Malaiya, Professor at Colorado State University, Fort Collins, CO, USA.
  • Hindi Language Resources A comprehensive site on the Hindi language built by Yashwant Malaiya
  • Indian Department of Official Language
  • Dua, Hans R. (1994a). Hindustani. In Asher (Ed.) (pp. 1554)
  • Liberman, Anatoly. (2004). Word Origins ... and How We Know Them: Etymology for Everyone. Delhi: Oxford University Press. ISBN 0-19-561643-X.
  • Rai, Amrit. (1984). A house divided: The origin and development of Hindi-Hindustani. Delhi: Oxford University Press. ISBN 0-19-561643-X.
  • Kuczkiewicz-Fraś, Agnieszka. (2003). "Perso-Arabic Hybrids in Hindi. The Socio-linguistic and Structural Analysis". Delhi: Manohar. ISBN 81-7304-498-8.
  • Kuczkiewicz-Fraś, Agnieszka. (2008). "Perso-Arabic Loanwords in Hindustani. Part I: Dictionary". Kraków: Księgarnia Akademicka. ISBN 978-83-7188-161-9.
  • Kuczkiewicz-Fraś, Agnieszka. (2012). "Perso-Arabic Loanwords in Hindustani. Part II: Linguistic Study". Kraków: Księgarnia Akademicka. ISBN 978-83-7638-294-4.