SINCE the dawn of civilisation, man has been asking questions such as ‘who are we?’ and ‘where have we come from?’ Until 1858 it was the universal belief that man is a special creation of God. In 1858, based on the phenotypic transitions seen in various plant and animal species, Charles Darwin proposed the theory of evolution; his book On the Origin of Species followed in 1859. Twelve years later, in 1871, he wrote The Descent of Man. Based on anatomical similarities, he declared that the chimpanzee and the gorilla are our closest living relatives and predicted that the earliest ancestors of humans would turn up in Africa, where our ape kin live today. It is now widely accepted that modern humans and chimpanzees diverged from a common ancestor nearly 6-7 million years ago. Based on fossil records found in Africa, it is now believed that modern humans originated from a single mother about 160,000 years ago in East Africa. East African mega-droughts between 135 and 75 thousand years ago, when the water volume of Lake Malawi was reduced by at least 95 per cent, could have caused their migration out of Africa. The obvious question to ask is: which route did they take? Our study of the tribes of the Andaman and Nicobar Islands using complete mitochondrial DNA sequences, and their comparison with the mitochondrial DNA sequences of world populations available in the database, led to the theory of a southern coastal route of migration through India, against the prevailing view of a northern route of migration via the Middle East, Europe, south-east Asia, Australia and then to India. Our earlier study revealed that the Negrito tribes of the Andaman and Nicobar Islands, such as the Onge, Jarawa, Great Andamanese and Sentinelese, are probably the descendants of the first man who moved out of Africa.
This raised many questions, such as: (i) what is the origin of the mainland tribal and caste populations? (ii) are there any populations in mainland India that are close to the Andamanese? (iii) how much affinity do the Indian populations have with the Andamanese? (iv) did the Indians contribute to the early human spread?
In order to answer these questions and to explore the ancient history of India we have harnessed genomic technology.
Ancient roots for India’s rich diversity
India represents one of the largest pools of human biodiversity in the world. There are 532 tribes, 72 primitive tribes and 36 hunter-gatherer groups. Although the genome sequences of any two unrelated people differ by just 0.1 per cent, that tiny slice of genetic material is a rich source of information. It provides clues that can help reconstruct the historical origins of modern populations. It also points to genetic variations that heighten the risk of certain diseases. In recent years, maps of human genetic variation have opened a window onto the diversity of populations across the world, yet India has been largely unrepresented until now.
To shed light on the genetic variability across the Indian subcontinent, we analysed 132 Indian samples from 25 groups on an Affymetrix 6.0 array, yielding data for 587,753 SNPs after restricting to markers with good completeness. To span the widest range of ancestry in India, we sampled tribal groups from 13 states and 6 language families (Indo-European, Dravidian, Austro-Asiatic, Tibeto-Burman, Great Andamanese and Jarawa-Onge). We also sampled caste groups, mostly from Uttar Pradesh and Andhra Pradesh, to permit comparison of traditionally “upper” and “lower” caste groups after controlling for geography. With tens of thousands of independent loci, we could estimate Fst, a measure of genetic differentiation between groups, accurately with just 2-9 samples per group (with an average standard error of ±0.0011). We also merged our data with 155 European (CEU), Chinese (CHB) and West African (YRI) samples from HapMap, and 938 samples from the Human Genome Diversity Panel (HGDP).
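To make the idea of Fst concrete, the sketch below computes it for two groups from per-SNP allele counts. It uses Hudson's estimator, a standard choice that corrects for small sample sizes by pooling information across many SNPs; this is an illustrative assumption and not necessarily the exact estimator used in the study. The function name and the toy allele counts are hypothetical.

```python
def hudson_fst(alt_counts_a, n_a, alt_counts_b, n_b):
    """Estimate Fst between two groups with Hudson's estimator.

    alt_counts_a, alt_counts_b: per-SNP counts of one allele in each group.
    n_a, n_b: number of chromosomes sampled per group (2 x individuals).
    """
    if n_a < 2 or n_b < 2:
        raise ValueError("need at least two chromosomes per group")
    numerator = 0.0
    denominator = 0.0
    for c_a, c_b in zip(alt_counts_a, alt_counts_b):
        p_a = c_a / n_a
        p_b = c_b / n_b
        # Squared frequency difference, corrected for finite sample size.
        numerator += ((p_a - p_b) ** 2
                      - p_a * (1 - p_a) / (n_a - 1)
                      - p_b * (1 - p_b) / (n_b - 1))
        # Probability that one chromosome drawn from each group differs.
        denominator += p_a * (1 - p_b) + p_b * (1 - p_a)
    if denominator == 0:
        raise ValueError("no SNP varies within or between the groups")
    # Ratio of sums across SNPs, not an average of per-SNP ratios.
    return numerator / denominator


# Toy data (hypothetical): 4 individuals (8 chromosomes) per group, 5 SNPs.
group_a = [8, 6, 4, 1, 0]
group_b = [7, 2, 4, 5, 0]
fst = hudson_fst(group_a, 8, group_b, 8)
print(f"Estimated Fst: {fst:.3f}")
```

Summing the numerator and denominator over all SNPs before dividing is what lets a handful of samples per group give a stable estimate when many independent loci are available.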
We analysed these data to address five questions about Indian genetics and history. Does the Indian subcontinent harbour more structure than Europe? Has strong endogamy been a long-standing feature of Indian groups? Do nearly all Indians descend from a mixture of populations, one of which was related to Central Asians, Middle Easterners and Europeans and probably lived in north India? Are tribal groups systematically different from castes, and do some tribal groups provide a good approximation for the ancestral populations of India? What is the origin of the indigenous Andaman Islanders?
All mainland Indian groups have inherited a mixture of ancestries
We provide strong evidence for two ancient and genetically divergent populations that are ancestral to most Indian groups today. One, the “Ancestral North Indians” (ANI), is genetically close to Middle Easterners, Central Asians and Europeans, while the other, the “Ancestral South Indians” (ASI), is not close to any group outside the subcontinent. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39 to 71 per cent, and is higher in traditionally upper caste groups and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India.
The finding that nearly all Indian groups descend from mixtures of two ancestral populations applies to traditional “tribes” as well as “castes”. It is impossible to distinguish castes from tribes using the data. The genetic data indicate that they are not systematically different. This supports the view that castes grew directly out of tribal-like organisations during the formation of Indian society. The one exception to the finding that all Indian groups are mixed is the indigenous people of the Andaman Islands, an archipelago in the Indian Ocean with a census of only a few hundred today. The Andamanese appear to be related exclusively to the Ancestral South Indian lineage and therefore lack Ancestral North Indian ancestry. In this sense, they are unique. Understanding their origins provides a window into the history of the Ancestral South Indians, and into the period tens of thousands of years ago when they diverged from other Eurasians. Our project to sample the disappearing tribes of the Andaman Islands has been more successful than we hoped, as the Andamanese are the only surviving remnant of the ancient colonisers of South Asia.
Our findings revealed that many groups in modern India descend from a small number of founding individuals, and have since been genetically isolated from other groups. In scientific parlance, this is called a “founder event”. It has medical implications for Indian populations. Recessive hereditary diseases – single gene disorders that occur only when an individual carries two malfunctioning copies of the relevant gene – are likely to be common in populations descended from so few ‘founder’ individuals. Mapping the causal genes will help to address this problem. The widespread history of founder events in Indian populations helps to explain why the incidence of genetic diseases among Indians is different from the rest of the world. For example, an ancient deletion of 25 bp in the cardiac myosin-binding protein C gene (MYBPC3) is associated with heritable cardiomyopathies as well as with an increased risk of heart failure. Its prevalence is high (~4 per cent) in the general population of the Indian subcontinent. However, this mutation is completely absent among people from the rest of the world.
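A small worked example shows why descent from few founders, followed by marriage within the group, raises the burden of recessive disease. It uses the standard population-genetics result that the fraction of individuals carrying two copies of an allele of frequency q is q² + F·q·(1 − q), where F is the inbreeding coefficient (F = 0 corresponds to random mating). The allele frequency and the value of F below are purely illustrative assumptions, not estimates from the study, and the function name is hypothetical.

```python
def recessive_affected_fraction(q, inbreeding_f=0.0):
    """Fraction of people carrying two copies of a recessive allele.

    q: frequency of the disease allele in the group (0 to 1).
    inbreeding_f: inbreeding coefficient F (0 means random mating).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    if not 0.0 <= inbreeding_f <= 1.0:
        raise ValueError("inbreeding coefficient must lie between 0 and 1")
    # Hardy-Weinberg term q^2 plus the excess homozygosity due to inbreeding.
    return q * q + inbreeding_f * q * (1.0 - q)


# Illustrative values only: a 1 per cent allele frequency, compared between
# a randomly mating population and an isolated group with F = 0.05.
q = 0.01
random_mating = recessive_affected_fraction(q)
endogamous = recessive_affected_fraction(q, inbreeding_f=0.05)
print(f"Random mating:  {random_mating:.6f}")
print(f"Isolated group: {endogamous:.6f}")
print(f"Ratio:          {endogamous / random_mating:.2f}")
```

Under these assumed numbers the isolated group has several times as many affected individuals for the same allele frequency. A founder event can also push a particular disease allele to a much higher frequency in one group by chance, which raises q itself.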
The finding that a large proportion of modern Indians descended from founder events means that India is genetically not a single large population; instead it is best described as many smaller isolated populations. Founder events in other groups, such as Finns and Ashkenazi Jews, are well known to increase the incidence of recessive genetic diseases, and our study predicts that the same will be true for many groups in India. It is important to carry out a systematic survey of Indian groups to identify which ones descend from the strongest founder events. Further studies of these groups should lead to the rapid discovery of genes that cause devastating diseases, and thus will help in the clinical care of individuals and families who are at risk.
Indo-European family of languages and the concept of Aryan and Dravidian
The concept of the Indo-European family of languages was proposed by Sir William Jones before the Asiatic Society at Calcutta in 1786 (Jones, 1786). The Indo-European concept was a real breakthrough of scientific linguistics, linking languages widely separated in space and forming two blocks: an eastern one of Persian and Indic languages and a western European one, separated from each other by Semitic and Turkic languages. The idea that two blocks of languages so distant from one another are nevertheless related was anything but obvious. Its discovery by Jones and others not only created a new science of language but also radically reordered existing ideas about the relations among different nations or races of people. Jones (1746-1794) was an employee of the East India Company who developed the Indo-European concept. He also made important identifications of words in the Romani or Gypsy language with Sanskrit (Jones, 1786). William Marsden’s (1754-1836) early paper, comparing the Gypsy language with Hindustani, makes him one of the co-discoverers of its Indian origins.
Max Muller, who was one of the first to apply the Aryan name to the Indo-European concept, identified the racial-linguistic entity as racially white and was instrumental in the formation of the racial theory of Indian civilisation.
The kings of South India, such as those of the Chola and Pandya dynasties, trace their lineages back to Manu. The Matsya Purana, moreover, makes Manu, the progenitor of all the Aryans, originally a south Indian king, Satyavrata. Hence there are not only traditions that make the Dravidians descendants of Vedic rishis and kings, but also traditions that make the Aryans of North India descendants of Dravidian kings. The two cultures are so intimately related that it is difficult to say which came first.
The present research findings are consistent with the view of one school of thought that the Aryans and Dravidians are part of the same culture and need not be spoken of as separate. However, they contradict a second school of historians, such as Max Muller, who was among the first to apply the Aryan name to the Indo-European concept. The findings strongly suggest that dividing the two and placing them at odds with each other serves the interest of neither and only damages their common culture.
Our study also highlights important questions that remain open for future research. One priority is to estimate a date for the ANI-ASI mixture; this may be possible by studying the length of stretches of ANI ancestry in modern Indian samples. Inferring a date is important, as we expect that it would shed light on the historical process leading to the present-day structure of Indian groups. A second priority would be to follow up on the observation that many Indian groups descend from a small number of founders. The groups with the strongest founder effects can then be analysed to identify genetic variants that we predict will account for substantial rates of recessive disease in these groups. Have Eurasians descended from the Ancestral North Indians? This is the question we would like to address in our future research activities.
Dandapany PS, Sadayappan S, Xue Y, Powell GT, Rani DS et al. (2009) A common Cardiac Myosin Binding Protein C variant associated with cardiomyopathies in South Asia. Nature Genetics 41, 187-191.
Reich D, Thangaraj K, Patterson N, Price AL, Singh L (2009) Reconstructing Indian population history. Nature 461, 489-494.
Scholz CA, Johnson TC, Cohen AS, King JW et al. (2007) East African megadroughts between 135 and 75 thousand years ago and bearing on early-modern human origins. Proc. Natl. Acad. Sci. (USA) 104, 16416-16421.
Thangaraj K, Singh L, Reddy AG, Rao VR, Sehgal SC et al. (2003) Genetic affinities of the Andaman Islanders, a vanishing human population. Curr. Biol. 13, 86-91.
Thangaraj K, Chaubey G, Kivisild T, Reddy AG, Singh V, Rasalkar A, Singh L (2005) Reconstructing the origin of Andaman Islanders. Science 308, 996.
Thangaraj K, Chaubey G, Kivisild T, Reddy AG, Singh V, Rasalkar A, Singh L (2006) Response to comment on “Reconstructing the origin of Andaman Islanders”. Science 311, 470b.
Trautmann TR (2004) Discovering Aryan and Dravidian in British India. Historiographia Linguistica. XXXI : 1, 33-58.
The International HapMap Consortium (2007) A second generation human haplotype map of over 3.1 million SNPs. Nature 449, 851-861.
(The writer is a Bhatnagar Fellow (CSIR) and former Director of the Centre for Cellular and Molecular Biology, Hyderabad)