Assamese language and Unicode controversy

Assam is known to many as a valley of lush green forests and blue mountains, and as a melting pot in which many ethnic groups and tribes have assimilated into Assamese culture. Although the word Assamese is often used to refer to different groups, the history of Assam dates back to before 200 BC. This is evident from the mention of Assam in the Mahabharata and from the writings of the famous Chinese monk and traveller Hiuen Tsang, which provide a valuable account of the then kingdom of Bhaskar Barman. Over thousands of years, various groups of people immigrated to Assam from all directions and contributed to Assamese culture and society. The Dimasa and Kachari people of the Bodo-Kachari tribe once ruled a large part of the North Eastern states, from Koch Bihar and Tripura to the upper districts of Assam. The ruins of a palace in Dimapur are proof of their achievements.

With the migration of the Ahom across the Patkai Mountains in the 13th century under the leadership of Siu-Ka-Pha began another wave of amalgamation, this time between the new migrants and indigenous groups such as the Barahi (a branch of the Bodo-Kachari family) and the Sutiya. Over time, most of the ethnic groups made their fair share of sacrifices, even to thwart the aggression of the mighty Mughal Empire, as is evident from the fact that the army of Lachit Borphukan included Mishing, Naga and other tribes. The Koch-Rajbangshis of lower Assam formed a strong kingdom in the medieval ages, with the unstoppable hero Chilarai. The Vaishnava saint Srimanta Sankardeva started his religious reform in Barpeta and on the river island of Majuli. Despite such a long and colorful history, Assam is known to most outsiders today as the home of many tea gardens and the one-horned rhino.

Unfortunately, the presence of Assamese text on the Internet or in digital form is insignificant. This reminds us of the forgotten or lost history of Assam before the advent of the Ahoms. The buranjis (histories), the royal chronicles documented by the Ahom priests, are considered some of the authentic sources of information on the medieval history of Assam. Although many of the younger generation are proud of being Assamese and of Assam’s glorious history, there is unfortunately a lack of documentation on the culture and history of Assam, in both Assamese and English. Does it mean that one day we will be considered a nation of people with no noticeable history and culture?

Today the Assamese language is used by nearly 13 million people in Assam and the North-Eastern states of India. Currently, there is no electronic database on Assamese history and culture, or even on day-to-day events in Assam. The online versions of the popular Assamese-language daily newspapers are presented as images. As a result, it is impossible for a search engine such as Google to search their content on any topic. This leaves Assamese-speaking people unable to carry out a Google-like search or to find a reliable source of information to refer to. Similarly, although the common Assamese person does not use English as their primary language, there is no online educational or mass media resource in Assamese text.

Today, almost all records and documentation are kept in digitized or electronic format, and the letters are encoded either in ASCII (American Standard Code for Information Interchange) or in Unicode. In our context, a code is essentially a number used to represent a symbol. The English alphabet is represented using ASCII, which has a total of 128 codes covering the letters A-Z and a-z, the digits 0-9, some punctuation, control codes, and the space character. But this is not sufficient to represent the various symbols and scripts used all over the world.
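To make the idea of "a code is a number" concrete, here is a minimal Python sketch. The helper name `is_ascii` and the sample strings are chosen only for illustration; the sketch simply prints the numeric code of a few characters and checks whether a piece of text fits within the 128 ASCII codes.

```python
# Each character is stored as a number (its code point).
for ch in ["A", "z", "0", " "]:
    print(repr(ch), ord(ch))


def is_ascii(text: str) -> bool:
    """Return True if every character fits in the 128 ASCII codes (0-127)."""
    return all(ord(ch) < 128 for ch in text)


# The Assamese/Bengali letter "ka", written as an escape so the
# example does not depend on the editor's encoding.
assamese_ka = "\u0995"

print(is_ascii("Assam"))      # English letters fit in ASCII
print(is_ascii(assamese_ka))  # an Assamese letter does not
```

The second check fails because the letter lies far outside the 0-127 range, which is exactly why a larger character set is needed.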

In 1991, an international non-governmental body known as the Unicode Consortium standardized a Universal Character Set (UCS) for encoding the world’s languages uniformly across computer systems. As a result, Unicode code points have been assigned to the glyphs (or characters, in a loose sense) used in most Indian scripts. The Consortium has not assigned the Assamese script a block of its own in the Universal Character Set. Instead, it treats the Assamese alphabet as part of the Bengali alphabet, with the inclusion of a few additional Assamese characters: ‘ra’ and ‘wa ba’. Fortunately, almost all Assamese characters, including the conjuncts, can be represented on the computer to display and store Assamese text.
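The following Python sketch illustrates how Assamese letters sit inside the Bengali block of Unicode (U+0980 to U+09FF). It uses the standard `unicodedata` module; the helper names `describe` and `in_bengali_block` are made up for this example. The two Assamese-specific letters appear under Bengali names, and a conjunct is stored as a sequence of ordinary code points joined by the virama sign.

```python
import unicodedata

# The Bengali block of Unicode, which also carries the Assamese letters.
BENGALI_BLOCK = range(0x0980, 0x0A00)


def describe(text: str) -> list[tuple[str, str]]:
    """List the code point and official Unicode name of each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in text]


def in_bengali_block(ch: str) -> bool:
    """Return True if the character lies in the Bengali block."""
    return ord(ch) in BENGALI_BLOCK


assamese_ra = "\u09F0"  # Assamese 'ra'
assamese_wa = "\u09F1"  # Assamese 'wa'
# The conjunct 'kssa' is stored as three code points: ka + virama + ssa.
conjunct_kssa = "\u0995\u09CD\u09B7"

for sample in (assamese_ra, assamese_wa, conjunct_kssa):
    for code, name in describe(sample):
        print(code, name)
```

Note that the official names say BENGALI even for the letters used only in Assamese, which reflects the treatment of Assamese as part of the Bengali script.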

Even so, the most compelling reason for the scarcity of Assamese text, and for the common person’s reluctance to use Assamese on a computer, is the difficulty of typing in Assamese. Sadly, at present there is a severe shortage of text editing software for writing Assamese effectively. There are several hundred commonly used glyphs related to Assamese characters. Moreover, the frequent use of conjunct letters adds another level of difficulty. It is therefore difficult for the common user to type in Assamese on a regular keyboard without easy-to-use software. In most publishing shops and newspaper printing houses, the non-Unicode Assamese typing tool Ramdhenu, a hybrid copycat version of Shree-Lipi, is generally used with Adobe PageMaker. PageMaker had no Unicode font support; it has since been discontinued by Adobe and replaced by InDesign, which does support Unicode. Such software uses non-Unicode proprietary font files and reuses the ASCII code points to render non-English glyphs. Here, 127 ASCII codes are repurposed: the font file maps each one to a segmented stroke rather than to a Latin character. For instance, in an ordinary font file the glyph at the hexadecimal code 0x41 is the shape of the letter “A”, but in the custom-designed font file it may be a stroke that means nothing on its own. Most characters are rendered by juxtaposing two or three of these strokes. Bold, italics and Assamese numerals require other, separate font files. So a single ASCII code corresponds to several different glyphs depending on the font file, and typing a character involves pressing complex key combinations. This confines the inputting of Assamese text on computers to highly skilled professionals and discourages the common person from typing in Assamese. Moreover, text in such non-standard codes cannot be used in emails or even on social media sites like Facebook.
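The problem with font-based encodings can be sketched in a few lines of Python. The mapping table below is entirely hypothetical: it is not the real layout of Ramdhenu, Shree-Lipi or any other font, and the function name `legacy_to_unicode` is invented for this illustration. The point is only that text typed with such a font is stored as plain ASCII codes, so it looks like Assamese only while that particular font is applied, and it must be converted through a font-specific table before it becomes real Unicode text.

```python
# HYPOTHETICAL table: pretend that in some legacy font the slot normally
# used for "A" draws the Assamese letter a, "k" draws ka and "m" draws ma.
# Real legacy fonts use their own, much larger, font-specific tables.
LEGACY_TO_UNICODE = {
    "A": "\u0985",  # Assamese letter a
    "k": "\u0995",  # Assamese letter ka
    "m": "\u09AE",  # Assamese letter ma
}


def legacy_to_unicode(legacy_text: str, table: dict[str, str] = LEGACY_TO_UNICODE) -> str:
    """Convert legacy font-encoded text to Unicode, leaving unknown codes as-is."""
    return "".join(table.get(ch, ch) for ch in legacy_text)


stored = "Akm"  # what is actually saved on disk: ordinary ASCII codes
converted = legacy_to_unicode(stored)

# Searching the stored text for a real Assamese letter finds nothing,
# because the file only contains the codes for "A", "k" and "m".
print("\u0985" in stored)
print("\u0985" in converted)
```

Without the right font, the stored text displays as the meaningless Latin string "Akm", which is why such text cannot be used in email or on social media sites.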

Because of the above drawbacks and the lack of effective Assamese text editing software, Assamese text on the Internet, or in any other digitized form, is very rare. Since Windows XP, browsers on Microsoft operating systems have used the Vrinda font by default for all Assamese Unicode-based websites. The remaining websites that carry Assamese text, mostly those of the Assamese print media, present their contents as images of pages. This prevents text searching and text mining in Assamese documents, and even attempts to digitize them. This is perhaps the underlying reason for the minuscule presence of Assamese-based information storage and retrieval systems, both offline and online. These issues may lead to the loss of many documents written in Assamese unless they are saved electronically in digitized format. In view of these shortcomings, many Unicode-based text editors, such as Leap Office, Shree-Lipi, Ramdhenu Plus, Jahnabi, Rodali and LuitPad, have been designed for typing Assamese text rapidly. Leap Office, a C-DAC project introduced in 2001, was the pioneering Unicode text editor for Indian languages, including Assamese. It is still not understood why C-DAC discontinued Leap Office after 2001 while continuing Unicode support in small tools like iLeap and ISM. After Leap Office, Shree-Lipi was the next Unicode text editor with Assamese language support, but it is still very expensive software. Introduced in 2013, Jahnabi has been able to capture the attention of the majority of Assamese Unicode users, while LuitPad was a finalist for the Manthan Award 2013. So far, the Government of Assam has not shown any interest in research and development of Assamese writing software.
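Once text is stored as Unicode rather than as page images, searching it becomes straightforward. The Python sketch below is a minimal illustration, not a full search engine; the function name `contains` and the sample strings are assumptions made for the example. It also shows one practical detail: the same Assamese letter can be typed either as a single code point or as a base letter plus a nukta sign, so both the document and the query are normalized with the standard `unicodedata.normalize` function before comparison.

```python
import unicodedata


def contains(document: str, query: str) -> bool:
    """Search for query in document after normalizing both to the same form."""
    doc_norm = unicodedata.normalize("NFC", document)
    query_norm = unicodedata.normalize("NFC", query)
    return query_norm in doc_norm


# The word "Asamiya" (Assamese) typed with the single code point U+09DF ...
precomposed = "\u0985\u09B8\u09AE\u09C0\u09DF\u09BE"
# ... and the same word typed as base letter U+09AF followed by nukta U+09BC.
decomposed = "\u0985\u09B8\u09AE\u09C0\u09AF\u09BC\u09BE"

print(decomposed in precomposed)          # naive search on raw code points
print(contains(precomposed, decomposed))  # search after normalization
```

None of this is possible when a newspaper page is published as an image, because the image contains no character codes to search at all.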

Additionally, the Government has to take steps to make the Assamese language compulsory in both government and private offices of Assam, and Assamese society has to be keen to use the Assamese script. It appalls the layman that the Assam government’s public communications are not available in the Assamese language. Or is the Assam government conspiring to make the Assamese nation even weaker and then manipulate the people of Assam with a divide-and-conquer strategy to fulfill its corrupt agenda? Perhaps the first step in making a nation weaker is to make its people ignorant of their history and culture, which will eventually erode their pride in being part of a culture.
