Human Prehistory and Language

Perhaps my interest in genealogy is related to my fascination with historical linguistics. Both fields seem to offer some interesting hints about ancient human migration. Even a layman without training can catch a glimpse of some of the fascinating ideas in language science. Although I have no expertise and may even qualify as a full-fledged ``crackpot,'' I share my musings here.

Ancient Conquests

Reading the ancient genealogies I am struck by how much mobility there was. We think of William `the Conqueror' as a Frenchman but in legend his ancestor Rollo had come from Norway less than two centuries earlier. Although that specific ancestry may be bogus, the Conqueror probably was descended from Vikings. The Anglo-Saxon rulers displaced by William had themselves migrated from Northern Europe and have legends that trace back to Central Asia. Of course there is a `survival bias' in these examples: when a people become militarily superior, like the Vikings, it is natural that their young nobles will set out to conquer foreign lands.

The ancient Celtic rulers of Ireland have a tradition that they came from the Black Sea area (near the steppes of present-day Ukraine). Their culture surely did arise from East European roots, as evidenced by their war chariots, use of iron, and their language, but how did their legendary tradition arise? Is it really possible that they retained an illiterate memory of their ancient travels? While the Celtic language is assumed to have arrived in Western Europe overland from Central Europe, the legends of the early Celtic Kings depict their dynasty as traveling via ship. Perhaps it isn't farfetched at all that at least the leadership traveled by sea: Ocean-going Phoenicians were exporting tin from Britain even before the Celts arrived, and obviously at least the last stage of the Celtic conquest of Ireland was by sea.

(On the subject of a Celtic origin in Eastern Europe, it is interesting to compare them with the ancient Tocharian culture of Western China, which shared a number of cultural motifs with the Celts. It is supposed that the Tocharians also originated near the Ukrainian steppes.)

Just as a military elite propagated faster in space than lower classes, so innovations, like farming or the use of iron, out-raced any migration of people. Often the original population will adopt the new language when advanced newcomers arrive. Thus a small group of people in Eastern Europe who learned to ride horses about 4000 BC left their linguistic mark all over the world. More speculatively, a somewhat earlier group in Southeast Asia learned to smelt bronze and may even have contributed their language to the distant Sumerians, though this linguistic tradition disappeared from West Asia with the arrival of Iron-Age people.

Thus people may get their genes and their technology or language from different sources. Even within genetics we can see a relative of this same dichotomy.

Homo Genealogy

Man (genus Homo) has existed all over Asia for over a million years, but modern man (H. sapiens sapiens) emerged from Africa during the very late Pleistocene epoch. Did modern man completely replace earlier man or did early man's genes survive and evolve under the impetus of advanced African culture? (This may be related to much more recent issues, like Indians having South Asian genes but speaking a European language.)

This is a controversial question with ambiguous evidence. Mitochondrial-DNA studies suggest extinction of early man, with the famous ``Eve of Africa'' theory. On the other hand, some scientists compare the bone structure of million-year-old humans in Africa with those in East Asia and find geographical differences that mimic the same geographical differences observed today.

Actually, the ``Eve of Africa'' theory is not inconsistent with the idea that H. erectus genes have survived. Thousands of women alive at the time of ``Eve'' left descendants; they just didn't happen to be uterine descendants. As an analogy, consider the genes of Charlemagne. His Y chromosome (agnatic lineage) disappeared in the 14th century after the untimely death of William de Burgh, 4th Earl of Ulster, but many millions of people today are descended from Charlemagne and may have inherited genes from his other chromosomes.

There is an interesting analogy between the triumph of Eve's mitochondrion and linguistic and cultural propagation. Although we may think mothers always teach their babies their grandmothers' language, in fact, people are often persuaded to adopt the language of an elite. Latin quickly overwhelmed Western Europe even though there was only limited colonization by Roman civilians. It mainly replaced Celtic, which was itself a recent arrival at the time. Most of the human genome in England and Western Europe predates even the Celtic invasion, yet Basque is the only pre-Celtic language to survive there.

It is difficult to reconstruct man's prehistory, and this discussion implies that even DNA study will not provide complete answers. Historical linguistics provides powerful clues, but it can't ``look back'' too far. How do the languages of 15 millennia ago (the late Pleistocene) line up with today's languages? Europe, Africa and South Asia are hidden in time, but logically we can identify the old Australian language as a prototype of today's Australian, and when the Amerinds arrived in America they must have spoken proto-Amerind. That much is relatively obvious; what is more surprising -- and more controversial -- is that there are enough clues to deduce what present-day languages are related to the language of the stone-age Magdalenian culture in North Asia.

(The claim that proto-Amerind was a single language is controversial but Greenberg and Ruhlen demonstrate a common set of pronouns and familial words across all Amerind language families.)

Dene-Caucasian Language Family

See a big-scale family tree of the world's languages, where Dene-Caucasian is shown as ``C -- Cauco-sinitic.''

By 18,000 to 15,000 BC, languages related to Dene-Caucasian dominated Northern Asia. One concrete, if circumstantial, supporting fact is the chronology of late paleolithic technology in North America and North Asia. Although early Amerinds developed the advanced Clovis technology, true microlithic technology did not arrive until the migration wave that brought the Na-Dene languages. A very similar toolkit was present in the Dyuktai valley of Northeast Asia as early as 16,000 BC (just after a long cold period when the valley was uninhabited). This doesn't prove that the ancient Dyuktai people are related to the Na-Dene migrants, but Occam's razor suggests it. If the Dyuktai technology preceded the Dene-Caucasian language it would probably have been transmitted to America in a pre-Dene migration wave.

The proto-Amerinds may have occupied parts of North Asia at the same time as the proto-Dene people, but they were, in some way, a weaker people and only the few who made a perilous journey across Beringia survived.

The Na-Dene migration wave occurred after Eurasiatic speakers arrived in Northeast Asia, bringing mesolithic technologies. The paleolithic Dene-Caucasian speakers were now outnumbered and it was their turn to ``flee for their lives.'' Those who made the perilous Alaska journey became the Na-Dene people (Athapaskans), while the major residue of Dene-Caucasian survivors moved South or East, probably conquering China shortly after its Neolithic revolution.

John Croft's Speculations on the Late Stone Age

A gray area, where reasoned conjecture is possible, lies between the wild guesswork in reconstructing Paleolithic prehistory and the relative confidence archaeologists have about post-Neolithic times, e.g. the Indo-European reconstruction I show in my final diagram.

Since this intermediate period (13,000 BC - 3000 BC) is too ancient for historic certainty, but not so remote as to require wild guesswork from me, I've provided John Croft's well-reasoned conjectural account of the evolution of Indo-European during this period. He suggests that domestication of the wolf may have been a key advantage the Eurasiatic speakers had over the paleolithic Dene-Caucasian speakers of Europe and North Asia.

Indo-European Language Family

There are many pages on the Web discussing the I-E Homeland.

The Eurasiatic phylum, dominant in Europe and North Asia, may be one of the newer phyla, perhaps splitting off from Afro-asiatic less than 15 millennia ago. One of its major constituent subphyla is Indo-European (I-E), which has been identified as the early language of the first horse riders, the `Kurgans' from Eastern Europe. While early farming societies were egalitarian settlements, the Kurgans were aggressive semi-nomads who eventually imposed caste systems in Greece, India and elsewhere. I-E languages dominated Europe even before Roman ascendancy and in the post-Columbian era rapidly overwhelmed the `New World' as well.

The Indo-European language family split into its component parts about 5 millennia ago. That is already a long time as linguistic change goes, so many cognates in the family are hard to recognize. For example, French cinq and English five both derive from the same Indo-European word for the number five! --

penkwe [PIE] --> penpe [pre-Germanic] --> fi:mf [proto-Germanic] --> fi:f [old English] --> five

penkwe [PIE] --> kwinkwe [Latin] --> cinq

At first it's hard to believe cinq and five could be cognates. The key linguistic change which makes this possible is assimilation. The two consonants in the early word for five are similar in that both are labial -- said with the lips. Numbers are said frequently and quickly so there is a tendency to use a repeated consonant in a two-syllable word. The early Germanic speakers chose to duplicate the bilabial p while the Italic (and Celtic) speakers duplicated the labialized velar kw. (In French the latter later lost its labialization -- became just k -- further disguising the cognates.)
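The assimilation step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: words are plain ASCII strings, `kw` stands for the labialized velar, and the helper name `assimilate` is invented here rather than taken from any linguistic toolkit.

```python
def assimilate(word, keep):
    """Make both labial consonants of `word` identical to `keep`.

    `keep` is either "p" (bilabial) or "kw" (labialized velar); the other
    consonant is rewritten to match it. Hypothetical helper for illustration.
    """
    other = {"p": "kw", "kw": "p"}[keep]
    return word.replace(other, keep)


pie_five = "penkwe"                    # PIE form as spelled in the text
print(assimilate(pie_five, "p"))       # Germanic-style outcome
print(assimilate(pie_five, "kw"))      # Italic-style outcome
```

The first call yields penpe, the pre-Germanic stage in the chain above. The second yields kwenkwe, which differs from the Latin stage kwinkwe only in its vowel, a separate change this sketch does not model.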

Early Indo-Europeans

In addition to horse-riding and chariots, innovations of the Kurgans may have included domestication of sheep, cattle and barley (and eventually the associated by-products of wool, cheese, and beer, respectively), use of hemp and leather cords, and battle-axes. But the mobility provided by oxen-driven wagons and/or horseback riding was surely the key to their rapid ``conquest'' of Europe.

The Kurgans probably didn't overwhelm Europe or India with their numbers; they came as small bands of adventurers rather than vast armies. Instead of succumbing to Kurgan military power, prior inhabitants may have happily adopted the Kurgan culture and language as a means to share their technical innovations. (It is said that the ancient Ertebolle people of Denmark had resisted the Neolithic revolution for many centuries, but began planting barley when they learned the Kurgan recipe for beer!) There is an interesting cultural argument that explains why the indigenous farmers adopted the Kurgan language, rather than vice versa. The free-wheeling Kurgans would have welcomed newcomers able to communicate, while the earlier farmers in Southern Europe and India formed conservative homogeneous societies (pre-conscious in Jaynes' terminology) which were unprepared to accept, or learn from, outsiders.

Here's a map showing the origins and splitting of the early Indo-European languages. All dates are 1000's of years BC, and very approximate. Geographic locations are also very approximate. The place where Hittite split off from the rest of I-E (about 4500 BC ?) is unknown and just placed arbitrarily here.

There are various ways to guess these dates. Greek, Balto-Slavic and Indo-Iranian are so close linguistically that separation couldn't have occurred much before 3000 BC. Nor is it likely to have occurred after the arrival of Greek-speaking Achaeans in Greece about 2200 BC, since so recent a separation would be expected to leave historical evidence, and there is none. Separation of Tocharian and Western European from Indo-Slavo-Greek must have occurred earlier (association with Afanasievo and Corded Ware cultures respectively, each ca 3000 BC, is widely accepted) but probably after chariots and wagons were introduced in the Ukraine ca 3500 BC. (Cognates of `wheel' (*kwe-kwlo) are shared across most non-Hittite I-E branches, including Germanic and Tocharian.)

Five extinct I-E subphyla and 8 of the nine surviving subphyla are shown. The ninth, Albanian, may be linked to one of three otherwise extinct Balkan subphyla: Illyrian (Ill.), Thracian (Thr.) or Phrygian (Phr.).

In addition to the horse-rider shown in Ukraine 4000 BC, some important non-linguistic dates are shown in a distinct color: Farming arrived in Southern Europe about 7000 BC and spread to Central Europe (with a signature ``LBK'' pottery) about 5500 BC, developing into the Funnel-Beaker Culture (TRB) of Central European farmers about 4000 BC.

The `Kurgan' horse riders from Ukraine are widely accepted as the source of early Indo-Iranians, Greeks, Armenians and Balto-Slavics, but the sources of Italo-Celtic and Germanic are controversial, as are the languages of the LBK and TRB cultures. Like the Kurgans, the TRB people used wheeled carts, and there was surely some communication between these two cultures. A popular theory identifies the first intrusions of Indo-Europeans into Central Europe as the Corded-Ware (Germanic?, ca 3000 BC), Battle-Axe (Balto-Slavic?), and Bell-Beaker (Italo-Celtic? ca 2500 BC) cultures, but it is disputed whether these evolved from TRB, or resulted from a Kurgan invasion.

It seems likely that a major linguistic impulse (call it ``Old European'') from SW Asia overwhelmed Europe a few millennia before the Kurgan culture, providing such languages as Etruscan, Pelasgian and, presumably, the language of the LBK culture, but none of these languages survive. (Basque probably has a Central Asian or North African origin more ancient than Old European.)

Some theorists believe that Old European was a sibling or parent of I-E itself, and that the Kurgans arrived in Ukraine from the West rather than the East. This view is supported by linguistic analysis of I-E's ``substrate'' (borrowings). In North Central Asia, I-E would have borrowed from Uralic and Caucasian and, only in the Western branches, ``Old European.'' In Asia Minor, I-E would have borrowed from Semitic and Kartvelian. In fact, the I-E substrate is Semitic and Kartvelian: for example the I-E words `taurus' (bull), `goat/kid', `star', `wine' and `seven' are all Semitic.

The most elaborate version of this theory leads to a Homeland in the Balkans and a chronology slightly earlier than that shown on the map above. In it, perhaps, the groups moving SE became Hittite; a group moving North merged with the Ertebolle-Ellerbek culture to form TRB, which spoke Germanic; groups moving East became the Tocharians and Kurgans; and the Celtic and Italic peoples emerged at some point in Central Europe. In this theory, the Kurgans were still I-E progenitors, but the Western branches (Germanic, Celtic, Italic) were already evolving in Central Europe before any `Kurgan invasion.'

This theory leads to a chronology which seemed satisfactory to me once, but I've since switched to the opposite side, and now believe that the Gimbutas/Mallory explanation of I-E origins is obviously correct. I'll show my old remarks, and then explain why I was idiotic.

If I-E were born in NW Asia instead, one would be left with two big enigmas: (1) Why has the language of the advanced TRB culture disappeared completely? ...

So What? Similarly, the Continental Celtic languages, which dominated much of Europe 2000 years ago, are extinct. (My other complaints were even more idiotic.)

The summary overview of the controversy has much to do with ``what looks like a homeland?'' which has much to do with geographical linguistics. The Austronesian family may be assumed to have arisen in Formosa, based on a fulcrum of fanout. Trying this with Indo-European, the fulcrum is the Black Sea, or more accurately, a little to its Northwest.

Most of the I-E branches seem to have originated near the Balkans. Does this mean the Proto-IE people were Balkans or Balkano-Danubians?

I don't think so; indeed the Balkano-Danubist theory almost defeats itself with its very name, since the Danubian/Kurgan or Western/Balkan/Kurgan linguistic split their theory requires doesn't really present itself. The IE tree is drawn as either ca. 10 branches breaking off simultaneously, or single branches splitting from a core every 2 centuries or so, rather than the 2-4 major branches that would be implied in any Balkan/Kurgan or Danubian/Kurgan system.

Amplifying on this last point, any Danubist theory will imply a distinctive breakdown into subfamilies, perhaps

     (Hitt ((Celt Ital) (Toch ((Balt Slav) Germ ((Hell Arme) (In-Ir))))))
  Old-Balkan  Danubian          Tripolye         Sredny Stog  Yamnaya

(There are other ways to map I-E to cultures in a Danubist or Balkan theory, but the above is favored by Danubists and may be one of the least implausible.)

The traditional view of I-E is that there is at most this much substructure:

     (Hitt (Celt Ital Toch (Balt Slav) Germ Hell Arme (In-Ir)))
         <--- "Secondary Prods. Revolution" or horse riding ---->

For their theory to be plausible, I-E Balkan/Danubists must posit that the lack of clear structure among I-E branches is anomalous, and that a correct tree structure explaining their Balkan/Danubian I-E expansion might come into view under the proper linguistic microscope.

Fortunately, colleagues of Prof. Ringe of Univ. Penn. have developed such a "microscope" and Ringe's I-E family tree is now well respected. (The mapping to cultures is mine, not Ringe's.)

     (Hitt (Celt Ital Toch (Hell (Arme (Germ* ((Balt Slav) (In-Ir)))))))
    Kwave-1    Kwave-2        Kwave-3   B.Axe/Corded_Ware  Andronovo

(Germ* denotes the dominant superstratum in a Kwave-2/Kwave-3 hybrid language, as implied by Ringe's work. Kwave denotes a numbered Kurgan invasion in Gimbutas' theory.)

It seems impossible to map this structure to a Balkan or Danubian origin theory; yet it clearly shows a Kurgan I-E Homeland spinning off a new subphylum every few centuries.
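The bracketed trees above can be compared mechanically. The following Python sketch is only an illustration, not part of Ringe's method. It parses the bracket notation into nested lists and reports two rough measures: nesting depth (how many successive splits a tree asserts) and the widest node (how many branches leave a single point). The function names are invented for this example.

```python
def parse_tree(text):
    """Parse a bracketed tree such as "(Hitt (Celt Ital))" into nested lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    stack = [[]]                      # stack[0] collects the finished root
    for tok in tokens:
        if tok == "(":
            stack.append([])          # open a new group
        elif tok == ")":
            if len(stack) < 2:
                raise ValueError("unbalanced ')'")
            node = stack.pop()        # close the group, attach to its parent
            stack[-1].append(node)
        else:
            stack[-1].append(tok)     # a leaf label such as "Germ*" or "In-Ir"
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("expected exactly one balanced tree")
    return stack[0][0]


def depth(node):
    """Levels of nested brackets; a bare label counts as 0."""
    if isinstance(node, str):
        return 0
    return 1 + max((depth(child) for child in node), default=0)


def widest(node):
    """Largest number of children found at any single node."""
    if isinstance(node, str):
        return 0
    return max([len(node)] + [widest(child) for child in node])


trees = {
    "Danubist": "(Hitt ((Celt Ital) (Toch ((Balt Slav) Germ ((Hell Arme) (In-Ir))))))",
    "traditional": "(Hitt (Celt Ital Toch (Balt Slav) Germ Hell Arme (In-Ir)))",
    "Ringe": "(Hitt (Celt Ital Toch (Hell (Arme (Germ* ((Balt Slav) (In-Ir)))))))",
}
for name, text in trees.items():
    tree = parse_tree(text)
    print(name, "depth:", depth(tree), "widest node:", widest(tree))
```

On these inputs the traditional tree comes out shallow with a single eight-way split, while Ringe's tree is the deepest and has one branch after another peeling off a core. These are crude shape measures only; they say nothing about which cultures the branches map to.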

Lack of affinity between Tocharian and Kurgan/Andronovo is always hard on a Balkan/Danubist theory, but note also that there's no way to arrange (Latin Greek Armenian Sanskrit) into a tree structure which fits both Ringe's work and a non-Kurgan I-E.

I think the illusion that the Balkans are the fulcrum of I-E procreation is explained very simply. The actual fulcrum, near the Ukraine, is very close to the Balkans and has a tendency to thrust (militarily) towards the Balkans. Kurgans were settling in Central Romania, just to the north of the Balkans, during the fifth millennium BC. As Kurgans (I-E speakers) left the I-E homeland (East European, or Pontic-Caspian, steppes) travelling southwest towards the agricultural riches and advanced metallurgy of the Balkans, the dialect of the emigrants would begin deviating from the core, possibly as they crossed the Danube River or thereabouts. Thus six or more of the branches of I-E would appear to derive from a Balkan fulcrum, but each would actually be a particular wave of Kurgan-speaking immigrants to South Central Europe. The proto-Hittites probably split off this way ca 4000 BC, the proto-Celts ca 3600 BC, the proto-Greeks ca 3400 BC, the proto-Armenians ca 3200 BC and so on. (Note that this pinpoints the Satem shift as 3300 BC!) The microstructure that this would imply for the I-E family tree agrees with that determined by Professor Ringe of Pennsylvania.

An interesting way of looking at the I-E Homeland problem is to note that the major candidates, Ukraine, Balkans, possibly Anatolia, would be adjacent and almost equivalent if the Black Sea weren't interposed as a barrier.

But note that the ``barrier'' is an illusion. A coast or sea operates more as a conduit than barrier, especially if it's fresh water. A ``Great Flood'' is now supposed to have occurred in a shallow fresh-water Black Sea basin ca 5600 BC. This date corresponds with the rise of advanced farming and stockbreeding cultures to the North and Northwest of the Black Sea.

Scientists can't predict next week's weather, so it's no surprise they have trouble reconstructing linguistic change five or ten millennia ago. Yet the clues are tantalizing.

Mystery of India

Today, all European languages come from the Eurasiatic phylum, with the sole exception of Basque. Doubtless there were other language phyla in Europe many millennia ago, but they have disappeared. India, on the other hand, still has languages from five of the twelve present-day phyla, as well as the unclassifiable Nahali language. With so many residual ancient languages in India, can their prehistory be guessed? Perhaps not; there may have been even greater linguistic diversity in ancient times and many of the important phyla may now be extinct.

Although there are 250 different Tibeto-Burman languages, their diversity is much less than that of a family like North Caucasian. With a mountainous habitat promoting diversity, one concludes that Tibeto-Burman is a recent arrival, appearing on the subcontinent long after the Neolithic revolution. Indo-Iranian (I-E) also has about 250 distinct languages, but is an even more recent arrival than Tibeto-Burman.

When people acquire a new language, they often retain vocabulary and other linguistic features of their older language. By careful analysis of Indo-Iranian and Tibeto-Burman, linguists can isolate such ancient linguistic traditions and even arrange them chronologically. When this is done, they find that Mundic must have been an important and widespread language family in prehistoric India. They also find an even more ancient ``Gangetic Language X'' which cannot be linked genetically to any present-day language.

Dravidian also shows up as a ``substrate language'' in Northern India and is widely believed to be the Harappan language, but the matter is very controversial. (So-called Indocentrists make I-E the Harappan tongue, but this theory is scoffed at outside of India.) Rather more than to I-E, Dravidian seems to have lent its vocabulary to West Asian languages in the Uralic family. Theorists conclude therefore that Dravidian is a post-Neolithic arrival from West Asia. (Some theorists trace Dravidian back further to West Africa. The details of Dravidian prehistory are a mystery, but if it did traverse Egypt it may have left a clue: the ancient Dravidians and Egyptians both worshipped snakes.)

Mundic, Burushaski, Nahali (or rather their ancient precursors) and the hypothesized Language X are the candidates for ancient Indian languages predating the Harappans. (Not to mention Indo-Pacific, which has left little trace.) Another ancient language with no known relationship to any other language is Kusunda -- recently extinct -- which was spoken in the Himalayan region before the arrival of Tibeto-Burman. Yet another ancient Himalayan language is called pre-Tharu; it is also extinct but makes its presence known in the substrate of Nepal's Tharu languages.

The Indus River (Harappan) bronze-age Civilization

Conventional histories emphasize the flood plains of the Nile, Tigris-Euphrates and China's Yellow River as the cradles of civilization with the Indus River as an ``also-ran'' but the early Indus civilization was in many ways the greatest of all. Villages appeared there about 6500 BC, a thousand years before the villages on the lower Nile. But while the Nile and Tigris-Euphrates civilizations have changed hands many times over the millennia, the Indus civilization didn't change hands -- it just disappeared without a trace.

Between about 2500 BC and 1750 BC a bronze-age culture flourished in present-day Pakistan and NW India. (It may have evolved from the Kot-Diji civilization ca 3000 BC.) In its day it was both the largest and the most advanced civilization in the world. Homes were made of fire-hardened brick; cities were well-organized and even had sewer systems. There is little evidence of any religion, nor of military activity. Like most early agrarians the Harappans seem to have had an egalitarian society. Their crafts were very advanced, as were their sea-faring skills. They engaged in trade with the Sumerian civilization to their West. Fragments survive of a beautiful pictographic script which has never been deciphered. No one knows what modern language the Harappan language most resembled.

The names of rivers are among the most conservative of words, often surviving even when a population is replaced. Just the sound of the word ``Mississippi'' tells us that the first Americans didn't speak English. But this doesn't seem to help much towards solving the Harappan mystery; they probably inherited a placename vocabulary from an earlier period and might themselves have wondered what the earlier inhabitants spoke.

Although only 24 distinct Mundic languages are recognized today, many scholars believe that an ancient Mundic language (or related Austric language) was prevalent in prehistoric West India. Perhaps the Harappans and even the Sumerians spoke an ancient Mundic language. One scholar finds strong similarities between Harappan script and a rare Mundic script (though spoken languages and scripts often have unrelated origins). Another scholar finds similarities between Harappan script and Easter Island script! (The Easter Island language, Polynesian, is classed in the same Austric superphylum as Mundic.) This gives rise to a provocative (if far-fetched) theory since the Polynesians and the Harappans were the two greatest seafaring people before the rise of the Phoenicians, and the rise of Harappan seapower is roughly contemporaneous with the arrival of Polynesians in the Indian Ocean. (The Polynesians certainly investigated the Indian coast, although most of their permanent settlements were on islands.)

Other experts prefer Burushaski as their guess for Harappan language, and a small fringe group (the ``Indocentrists'') claims that the Harappans spoke an Indo-European language. Dravidian was once the default choice for the ancient Harappan language, but that theory seems to have many skeptics today.

The Harappans suffered from the erratic behavior of their two rivers, the Indus and Saraswati, and many of their cities had to be completely rebuilt several times. Finally, about when the Saraswati River disappeared for good, the Harappan civilization disappeared, and none of today's ethnic Indian groups claim it as their heritage.

The Vedic Aryans

The dominant cultural tradition in today's India is that of the Vedic Aryans. They appear in history a few centuries after the downfall of the Harappans but prepared written legends that go further back. Both India's famous caste system and its dominant religion, Hinduism, trace their roots to the Vedic Aryans. There is no controversy about the Vedic language. It was Indo-European, and eventually evolved into the dominant written language of Sanskrit.

Where did the Vedic Aryans come from? Their own histories imply they came from Eastern India but there is very little historical or archaeological evidence. It is the nature of a caste system that a second tier of people is elevated to support the small first-tier population, and there would be several good reasons why histories like the Rgveda would document the second-tier people rather than the first-tier, especially if the latter were illiterate foreign invaders. The Aryan invaders provided India with a new religion and a new language, but the ancient Indian folklore was modified, not replaced. Much confusion in the Indian mystery dissolves when this is recognized.

The Rgveda therefore can't be trusted and archaeological evidence is ambiguous, but one science clarifies where the first-tier Aryans must have come from: linguistics. The Aryan languages are so similar to Slavic and Greek that the Aryan precursors could not have left the common ``I-E Homeland'' before 3000 BC and probably left even later. Either the language now dominant throughout northern India arrived from West Asia during or after the Harappan era, or India itself must have been the I-E Homeland. The latter case completely conflicts with linguistic evidence, but in the former case, one imagines small troops of warriors from Northwest Asia crossing the Khyber Pass and, with superior military technology (horses, armor), eventually dominating an entire subcontinent. This process occurred only a few centuries before the Celtic conquest of Western Europe, but while the Celts have proud legends of their origin far to the East, the Aryans of India were so outnumbered they may have felt a need to disguise their foreign origin, perhaps even inventing a religion for a similar purpose.

The earliest Aryan invaders may have cohabited with the Harappans, with the main invasion arriving at or after the Harappan collapse. That such a magnificent civilization disappeared completely suggests that the Aryan invaders were a destructive influence, but there is no evidence that the Aryan invasion precipitated the Harappan collapse. In fact there's only circumstantial evidence that there was an Aryan invasion at all. A `DNA signature' is visible with most major prehistoric migrations but there is no such signature for the presumed `Aryan invasion.' The Hindu Kush Mountains were a formidable obstacle, so the `invasion' may have been pulled off by a small number of male warriors who didn't bring their womenfolk -- yet it led to almost complete replacement of the linguistic base of Northern India.

No history book records the `Aryan invasion' (though if it truly led to the invention of Hinduism and the imposition of the caste system it was a profound event) and most archaeological evidence is ambiguous or controversial. The bronze-age civilization of Bactria (Afghanistan) contemporaneous with the Harappans may be the `smoking gun' which demonstrates Indo-European arrival with signature I-E elements like sacrificial altars and pictures of trumpets (denoting I-E cavalry). However the strongest evidence that an `Aryan invasion' must have occurred uses tantalizing cultural and, especially, linguistic clues.

When the vocabularies of Balto-Slavic and Indo-Iranian are compared, it is found that they share common words for temperate animals and crops, but have developed independent words for tropical things. This is a strong suggestion that the Aryans came from temperate Northwest Asia: the Balto-Slavs certainly didn't come from tropical India.

Linguistic diversity is always greatest in the place where language has been entrenched the longest. Linguists are confident that Formosa was the original homeland of the Polynesian people even though there's no archaeological evidence of it: there is more diversity among the aboriginal languages of Formosa than among all other Austronesian languages put together. The lack of diversity among the Indic branches of I-E yields a strong argument that India cannot be the I-E homeland: even Persian is classed in the same subphylum as Indic, and it must have split off before Hindu literacy (Sanskrit) acted as a brake on Indic diversity.

The theory that the Harappans were related to the Aryans is also suspect. The Rgveda tales describe a rural culture with primitive dwellings of stone and bamboo: where is the advanced civilization of the Harappans? The Harappans ate fish and mollusks; the Aryans abhorred all seafood. Harappan society was egalitarian, perhaps matriarchal; the Aryans had a male-dominant society with ritual sacrifices. Harappan art emphasized bull, tiger and elephant; Aryan art emphasized horse and cow (eventually leading to Hindu cow worship). In two areas traditional Aryan technology exceeded Harappan: they had iron and spoked wheels. Each of these was a Central Asian or Eastern European invention.

I hope no one takes offense at my suggestion that 35 centuries of Indian culture are based on a geographic fraud committed by a few Aryan invaders. I have no ethnic agenda but just find such reconstruction fascinating.


Ancient human cultural innovations diffused throughout the world rapidly. Although trying to relate these innovations to specific linguistic impulses (e.g., Cauco-Sinitic = Magdalenian tools; Austro-Asiatic = bronze smelting; Eurasiatic = domestication of wolf; Indo-European = domestication of horse) may be unreliable and oversimplified, the linguistic associations do provide tantalizing clues about some of the details.

A Tri-Modal Curiosity in Linguistics

(Muhammad Ali beat George Foreman, and George Foreman beat Joe Frazier. But Joe Frazier beat Muhammad Ali! A simpler version of the same ``tri-modal equilibrium'' is the Rock-Scissors-Paper game played by children.)

Languages change in predictable ways. It is tempting to compare with physics where excess potential energy is expelled: water flows downhill; linguistic forms become more relaxed. In such schemes, we don't expect flowing water to return to its origin (except in an Escher drawing). The Ali-Foreman-Frazier paradigm is more complex than a simple minimization scenario, and Ali did return to be champion!

Language relaxes. It may come as no surprise that our words ``chimney'' and ``nite'' come from Middle English ``chimenee'' and ``night'' but it would be unexpected for the third syllable or extraneous fricative to reappear in the future. Similarly if our schoolteachers indoctrinated their students to articulate ``I have to go'' we'd be saying ``I hafta go'' again after a few decades. (These examples of linguistic ``relaxation'' involve physical relaxation of lips and other vocal apparatus, but I mean the term in a more general way to include relaxation of grammatical rules, and so on.)

Spanish padre and English father are cognates, but a linguist could state with some confidence that the former word is probably closer to the ancient form. The phonetic transformations p --> f and d --> th are much more common than their inverses: human speech evolves towards the more relaxed form.

Similarly, grammatical forms relax. The German definite articles der, das, die, dem, den, etc. have evolved into the more relaxed single-word formula of English, where only the exists. (Since the German articles all mean almost the same thing, it's simpler -- more ``relaxed'' -- to render them the same.) The way German mutates the sounds within a word to distinguish cases is called inflectional grammar. English replaces this with an isolational grammar in which fewer, fixed words are used, with the semantic role of case inflections taken over by word order, or added word particles (e.g. prepositions). German and English are relatively close; a comparison of Latin with English would show even more examples of the inflection --> isolation transformation. Better yet, compare any Indo-European language with the reconstructed ``Proto-Indo-European'' language: PIE is reconstructed with more grammatical cases than most of its daughter languages retain.

But now the paradox. If inflectional grammar has a tendency to relax to an isolational form, doesn't this imply that, after so many millennia, human speech would have ``relaxed'' to isolational grammar long ago? PIE developed from much older languages; where did its inflections come from?

The solution to this paradox arises from the fact that isolational grammar is not stable but has its own preferred evolutionary path. When inflected forms are collapsed into single isolated words, meaning is lost, but the resultant vocabulary of fixed words provides its own solution. In the sentence ``you all will have read this'', ``all'' is combined with ``you'' since English has lost the inflection which distinguishes second-person plural from singular, and ``will have'' are inserted since English has lost most of its inflected tenses. The two pronunciations of ``read'' survive as evidence that English still retains some inflectional grammar. Combining words, when done in a regular way, is called agglutinative grammar. Over time, such regularization means that an isolational grammar will tend to evolve into an agglutinative grammar.

Eventually, in agglutinative grammar, long phrases are repeated so often that it becomes natural to say them quickly, distorting syllables or even blurring them together. For example the French seven-word phrase ``Qu'est-ce qu'il y a'' is pronounced with just three syllables ``Kes kil ya'' (and has a meaning similar to the single English question syllable What's). This final relaxation mode changes agglutinative grammar into inflectional grammar. We've come full circle! The transformations are not automatic, and a language family may exist for many millennia without ever going through an inflectional phase, but this does explain how inflectional grammars can arise.

What does this have to do with Ali-Foreman-Frazier? Just that at least three modalities are required to develop this dynamic behavior. With only two heavyweight contenders, Ali would always beat Foreman and the crown wouldn't change hands. But since Frazier could beat Ali, but lost in turn to Foreman, there was no single-champion equilibrium.

In the same way, the three grammar relaxation modes

    inflectional --> isolational
    isolational --> agglutinative
    agglutinative --> inflectional

keep linguistics from getting boring!
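
For readers who like to see such a cycle spelled out, here is a toy sketch in Python. It is only an illustration of the argument above, not a linguistic model: the names NEXT_MODE, TWO_MODES, relax and has_resting_point are invented for this example, and each ``relaxation'' is assumed to be a single deterministic step.

```python
# Toy illustration only: all names here are hypothetical, and real
# language change is neither this regular nor this deterministic.
NEXT_MODE = {
    "inflectional": "isolational",    # inflections collapse into fixed words
    "isolational": "agglutinative",   # fixed words are combined in regular ways
    "agglutinative": "inflectional",  # combined words blur into inflections
}

# With only two modes there is nowhere left to relax to.
TWO_MODES = {"inflectional": "isolational"}


def relax(mode, steps):
    """Return the list of grammar modes visited, starting from `mode`."""
    history = [mode]
    for _ in range(steps):
        mode = NEXT_MODE[mode]
        history.append(mode)
    return history


def has_resting_point(transitions):
    """True if some mode has no onward relaxation (or relaxes to itself)."""
    modes = set(transitions) | set(transitions.values())
    return any(transitions.get(m, m) == m for m in modes)


print(relax("inflectional", 3))
print(has_resting_point(NEXT_MODE))
print(has_resting_point(TWO_MODES))
```

Three steps of relaxation bring the starting mode back around, and the three-mode table has no resting point, whereas the two-mode table does: the same reason a two-boxer world would have a permanent champion.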


Go back to James Allen's home page.