Scholars at odds over mysterious Indus script
19:00 23 April 2009 by
Tablets and scrolls containing 4500-year-old Indus script were first discovered in the late 19th century, though no one has successfully translated the script (Image: J M Kenoyer/Harappa.com)
Most inscriptions are just a handful of characters long, leading some researchers to propose that the script was used for religious or political imagery, not a written language (Image: J M Kenoyer/Harappa.com)
The new study contends that ordering of the symbols in Indus script suggests that it is a genuine language (Image: J M Kenoyer/Harappa.com)
An as yet undeciphered script found on relics from the Indus valley constitutes a genuine written language, a new mathematical analysis suggests.
The finding is the latest chapter in a bitter dispute over the interpretation of "Indus script". This is the name given to a collection of symbols found on artefacts from the Indus valley civilisation, which flourished in what is now eastern Pakistan and western India between 2500 and 1900 BC.
In 2002, a team of linguists and historians argued that the script did not represent language at all, but religious or political imagery.
Ordered or random?
From an analysis of the frequency and distribution of the script's characters, the team concluded that it showed few of the hallmarks of language. Most of the inscriptions contain fewer than five characters, few of the characters repeat, and many of the symbols occur very infrequently.
A team led by Rajesh Rao, a computer scientist at the University of Washington in Seattle, assessed the script samples using a measure called "conditional entropy". Applied to language, this statistical technique yields a measure of the "orderedness" of words, letters or characters – from totally ordered to utterly random.
"If you look at strings that contain words, then you should see that for any particular word in the string there is going to be some amount of flexibility in choosing the next word, but they're not randomly ordered," Rao says.
Which word next?
For instance, in English text, if you find the fragment "The boy went to the", there is some flexibility in what follows. Nouns like "park" and "circus" make sense, but a verb such as "eat" does not.
Rao's team applied this analysis to Indus script, Sanskrit, an ancient south Indian language called Old Tamil, and English. They also tested the conditional entropy of the Fortran computer programming language and non-languages, including DNA and protein sequences.
Indus script characters turned out to be about as randomly ordered as the other languages. Unsurprisingly, they proved less random than DNA or protein sequences and more random than the computer language, where unambiguity is essential.
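The idea of conditional entropy can be sketched in a few lines of Python. This is only an illustrative estimate from adjacent pairs of symbols, not the study's actual method or data: the function name `conditional_entropy`, the four-symbol alphabet and the toy sentence are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def conditional_entropy(seq):
    """Estimate H(next | current) in bits from adjacent pairs in seq."""
    pairs = list(zip(seq, seq[1:]))
    pair_counts = Counter(pairs)
    first_counts = Counter(a for a, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (a, b), count in pair_counts.items():
        p_pair = count / n                       # estimated P(a, b)
        p_next_given_a = count / first_counts[a]  # estimated P(b | a)
        h -= p_pair * math.log2(p_next_given_a)
    return h

symbols = list("abcd")  # hypothetical four-symbol alphabet

# Totally ordered: a b c d a b c d ... (the next symbol is always determined)
ordered = symbols * 250

# Totally random: each symbol drawn independently and uniformly
rng = random.Random(0)
shuffled = [rng.choice(symbols) for _ in range(1000)]

# Toy "language" sample: some freedom after "the", none after "went"
text = ("the boy went to the park and the girl went to the circus "
        "and the boy went to the circus").split()

h_ordered = conditional_entropy(ordered)
h_random = conditional_entropy(shuffled)
h_text = conditional_entropy(text)

print(f"ordered: {h_ordered:.3f} bits")
print(f"random:  {h_random:.3f} bits")
print(f"text:    {h_text:.3f} bits")
```

In this sketch the ordered sequence scores zero, the random one comes close to the two-bit maximum for four symbols, and the toy sentence falls in between, which is the kind of intermediate value the article describes for natural languages.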
"Now we can say, based on this evidence, that they probably were literate, so the big question becomes: Can you get at the underlying grammar?" Rao says. He hopes to refine his team's technique to determine the grammatical structure of Indus script and, potentially, the language family it belongs to.
"I think we are going to need more archival data, and if we are lucky enough we might stumble on a Rosetta Stone-like artefact," Rao says.
Rao's paper has already drawn a strong response from the researchers who proposed that Indus script represents religious and political symbols, not language.
"There's zero chance the Indus valley is literate. Zero," says Steve Farmer, an independent scholar in Palo Alto, California, who authored a paper with two academics under the goading title "The Collapse of the Indus Script Thesis: The myth of a literate Harappan civilization."
As well as comparing the conditional entropy of Indus script to that of known languages, Rao's team compared it with two simulated character sets – one totally random, one totally ordered.
Farmer and colleagues Michael Witzel of Harvard University and Richard Sproat of Oregon Health and Science University in Portland contend that the comparison with artificially created data sets is meaningless, as are the resulting conclusions. "As they say: garbage in, garbage out," Witzel says.
Farmer says that the debate over Indus script is more than academic chest thumping. If Indus script is not a language, a close analysis of its symbols could offer unique insight into the Indus Valley civilisation. Some symbols are more common in some geographical locations than others, and symbol usage seems to have changed over time.
"You suddenly have a new key for unlocking how that civilisation functioned and what its history was like," he says.
"At present they are lumping more than 700 years of writing into one data set," he says. "I am actually going to be working with them on the revised analysis, and we will see how similar or different it is from the current results."
Analysis of the 4500-year-old Indus Script
Despite a large number of attempts, the script of the Indus civilization (circa 2500-1900 BC) remains undeciphered. The absence of a multilingual "Rosetta stone" as well as our lack of knowledge of the underlying language have stymied decipherment efforts. Rather than attempting to ascribe meaning to the inscriptions, we are applying statistical techniques from the fields of machine learning, information theory, and computational linguistics to first gain an understanding of the sequential structure of the script. The goal is to discover the grammatical rules that govern the sequencing of signs in the script, with the hope that such rules will aid future decipherment efforts.
In the opinion of Dr S. Kalyanaraman
A possible resolution of the problem lies in thinking outside the box. Many attempts at decipherment have assumed that the signs must represent letters of an alphabet or syllables, and many have ignored the reading of the pictorial motifs, which are quite unambiguous. A simple solution is that both signs and pictorial motifs represent words of a spoken language. The whole code unravels as related to the repertoire of mine workers, smiths and metal workers: minerals, metals, alloys and furnace/smelter types. See details at http://sites.google.com/site/kalyan97/ (presented in 15 volumes). There are Rosetta stones, such as the tin ingots bearing glyphs of the writing system. Many inscriptions of the Indus script also occur on copper plates and other metal objects, pointing to a link between the invention of the writing system and the invention of alloying to create new metal artefacts during the early Bronze Age. The hieroglyphs are signs and pictorial motifs that enabled communication of words related to these Bronze Age artefacts and to the repertoire of a mine worker or smithy/mint.
Indus Valley civilisation was literate, new study reports
The 4,000-year-old Indus Valley civilisation, which thrived on what is now the Indo-Pak border, might have been a literate society that used a script close to present-day languages like Tamil, Sanskrit and English, reveals a new finding announced on Thursday.
A group of Indian scientists has conducted a statistical study of the symbols found in the Indus Valley and compared them with linguistic scripts and nonlinguistic systems like the encoding of DNA and computer programming.
They found that the statistical structure of the inscriptions closely matched that of spoken languages such as Tamil, Sanskrit and English.
The results, published in the journal Science, show that the Indus script could be an "as-yet-unknown language." Scientists from the Tata Institute of Fundamental Research in Mumbai, and from the Institute of Mathematical Sciences and the Indus Research Centre in Chennai, collaborated with Mr Rao to develop models that helped compare the symbols with modern languages.
Symbols in any language have some amount of flexibility, or conditional entropy, which helps in analysis of a language structure.
"For example, the letter 't' can be followed by vowels like 'a', 'e', and some consonants like 'r' but typically not by 'b,' 'd' etc. We measured this flexibility (or randomness) in the choice of the next symbol," Mr Rao explained.
Scientists found that the randomness in the symbols of Indus inscriptions closely matched that of spoken languages.
"Despite more than 100 attempts, the script has not yet been deciphered. The underlying assumption has always been that the script encodes language," Mr Rao said.
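The next-symbol flexibility Mr Rao describes for the letter 't' can be illustrated with a short Python sketch. It counts which letters follow each letter inside the words of a sample sentence and converts the counts into an entropy in bits. The function name `next_symbol_entropy` and the sample sentence are hypothetical, chosen only for illustration; nothing here reproduces the researchers' models or data.

```python
import math
from collections import Counter, defaultdict

def next_symbol_entropy(text):
    """For each letter, count its followers and compute their entropy in bits."""
    followers = defaultdict(Counter)
    for word in text.lower().split():
        # Only pairs inside a word are counted; spaces are not symbols here.
        for cur, nxt in zip(word, word[1:]):
            if cur.isalpha() and nxt.isalpha():
                followers[cur][nxt] += 1
    entropy = {}
    for cur, counts in followers.items():
        total = sum(counts.values())
        entropy[cur] = -sum(
            (c / total) * math.log2(c / total) for c in counts.values()
        )
    return followers, entropy

# Hypothetical sample text, far too small for a real analysis
sample = "the train takes ten tourists to the tower at two"
followers, entropy = next_symbol_entropy(sample)

print("letters seen after 't':", sorted(followers["t"].items()))
print("entropy after 't' (bits):", round(entropy["t"], 2))
print("entropy after 'h' (bits):", round(entropy["h"], 2))
```

In this sample 't' is followed by several different letters but never by 'b' or 'd', while 'h' is always followed by 'e', so the choice after 't' is flexible without being random and the choice after 'h' is fully determined.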
The Indus Valley civilisation, also known as the Harappan civilisation and a contemporary of the Egyptian and Mesopotamian cultures, spread across present-day eastern Pakistan and northwestern parts of India.
The researchers are now working on deciphering the grammar and rules governing the language. "For now, we want to analyse the structure and syntax of the script and infer its grammatical rules. Some day we could leverage this information to get to a decipherment," Mr Rao said.