Locally Yours
Indian language (Indic) computing has come a long way. Although it still has miles to go, the journey so far has been quite an impressive one. KTP Radhika logs in to check the scenario.
Localization is the buzzword in business and technology today. The impressive progress being made in local language computing in India stands testimony to the success of this concept. In a country like India, where the majority of the billion plus population isn’t comfortable using English, local-language computing holds immense potential for business as well as for communication.
Inventions and advances in technology have helped local language computing grow. The advent and development of Unicode, local language fonts and rendering systems, along with innovations in input methods and keyboard layouts, have contributed to the success. Today, many operating systems ship versions with updated localization, new local fonts and bug-fixed local input methods. Companies such as Microsoft, Google, Red Hat and SAP are taking localization quite seriously and are coming up with tools and technologies to benefit local-language computing.
The need, demand and trends
Internet usage has grown rapidly in the past few years in India. It is no longer viewed as an urban-centric service. True, it offers huge opportunities to rural India. That said, its growth and visibility would be minimal unless computing is accessible in local languages as well. That’s exactly the issue that the government is trying to address. Addressing a conference recently on local language computing in Delhi, Sachin Pilot, Minister of State for Communications and IT, said that, due to lack of local and regional language support, the Internet still was not accessible to the bulk of the country’s population. “The time has come to create local language content and applications so that people in rural India can easily access information and entertainment. Indian entrepreneurs should look at this untapped market,” said Pilot.
What he did not specify was the fact that India’s English-speaking market has been mostly saturated in terms of tapping new Internet users. The next bunch of users are going to be those who are comfortable in local languages. Lalitesh Katragadda, Head of Products, Google India, commented, “In India, most users who know English are using the Internet. So the next level of users will be those who are looking for local languages such as Hindi, Marathi, Tamil and Telugu.”
Another important factor to consider here is the widespread use of mobile devices even in rural areas for content consumption. This is resulting in the need for content availability in regional languages.
Standardization, a current thrust area, holds the key to success in Indian language computing. According to Swapnil Belhe, Team Leader – Image Processing GIST, C-DAC, many vendors working on Indic language computing have produced their own proprietary implementations, which is creating a bit of a mess. “These proprietary formats have to be converted into a standardized one. Standardization will solve this issue,” said Belhe. Government agencies such as the Department of Information Technology (DIT) and Center for Development of Advanced Computing (C-DAC) are working to provide standards for local language computing.
As a first step, Unicode standards have been implemented for all 22 official Indian languages. Until this was achieved, most regional languages in India lacked uniformity, especially on the Web. “With the adoption of Unicode as the global standard, there is great potential to innovate in terms of local language offerings, both at a macro and a micro level,” said Meghashyam Karanam, Product Marketing Manager – Project, Visio & Localization, Microsoft India. The latest version, Unicode 6.1, covers over 100,000 characters and supports over a hundred scripts from across the world, providing a great platform for Indic languages to enter the computing mainstream. However, the way ahead is anything but easy. To implement Indian language computing on a large scale, adoption of the Unicode format should be made mandatory for everyone.
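The uniformity that Unicode brings can be seen in a small way with the Devanagari letter QA, which may be stored either as a single code point or as KA followed by a nukta sign. The Python sketch below is only an illustration using the standard `unicodedata` module; the helper name `same_text` is our own and does not come from any vendor or agency mentioned in this article.

```python
import unicodedata

# Two ways of storing the same Devanagari letter QA.
precomposed = "\u0958"       # DEVANAGARI LETTER QA as one code point
decomposed = "\u0915\u093C"  # DEVANAGARI LETTER KA + DEVANAGARI SIGN NUKTA


def same_text(a, b):
    """Compare two strings after Unicode NFC normalization (illustrative helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw comparison treats the two encodings as different data ...
raw_equal = precomposed == decomposed
# ... while a comparison after normalization treats them as the same letter.
normalized_equal = same_text(precomposed, decomposed)
```

Software that compares or searches text without such a normalization step would treat the two spellings as different words, which is one reason a single agreed encoding matters.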
Another area where standardization is required is keyboards. In Indic language computing, keyboard, font and rendering issues, among other things, have to be addressed properly so that keystrokes map to the correct characters. Swaran Lata, Country Manager, W3C India, explained, “In the case of the English language, the mapping is one to one but in the case of Indian languages, more than one or two keystrokes will lead to one correct letter. Here, standardization is important as otherwise the data will be corrupted.”
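To make the point about keystrokes concrete, here is a minimal Python sketch of how several key presses can produce what a reader sees as one letter. The key assignments in `TOY_KEYMAP` are hypothetical and chosen purely for illustration; they are not the Inscript layout or any other real standard.

```python
import unicodedata

# Hypothetical toy layout for illustration only, NOT real Inscript assignments.
TOY_KEYMAP = {
    "k": "\u0915",  # DEVANAGARI LETTER KA
    "x": "\u094D",  # DEVANAGARI SIGN VIRAMA (halant)
    "S": "\u0937",  # DEVANAGARI LETTER SSA
}


def type_keys(keys, keymap=TOY_KEYMAP):
    """Translate a sequence of key presses into Unicode text."""
    return "".join(keymap[key] for key in keys)


# Three key presses give three code points, which a rendering engine
# would normally draw as the single conjunct letter "ksha".
conjunct = type_keys("kxS")
code_point_names = [unicodedata.name(ch) for ch in conjunct]
```

If two vendors assign these keys differently, or store the conjunct in a private format, the same three key presses produce different data, which is the corruption risk the quote describes.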
Government bodies such as C-DAC and Technology Development for Indian Languages (TDIL) are working to address this issue. The government has come up with a standard enhanced Inscript keyboard called Inscript 2. This standard can accommodate the characters in Unicode 5.2 on the keyboard, and its design is intended to take in all future inclusions.
Another change that will have a huge impact on Indian language computing is the arrival of Indian language domain names. Belhe from C-DAC said, “Soon you will see .bharat (dot bharat) in Devanagari or in any other Indian script. DIT and C-DAC along with the National Internet Exchange of India (NIXI) are doing a lot of research in developing applications and domain names in local languages.” This will boost Internet penetration at the grassroots level.
Localized solutions, community efforts
As innovations improve and local language computing becomes easier, more vendors will start using Indic language support for building high-quality applications. Today, local language IT applications are increasingly finding users across the board, in the public sector as well as in private enterprises. Such applications will also provide a real and long-term benefit to primary and secondary school students. Another key user segment would be vernacular publishers, who are increasingly looking for end-to-end solutions in the form of desktop publishing and local language computing hardware. Other private enterprises too stand to gain from this trend. The SME sector is a case in point. A vast majority of Indian SMEs operate in non-English-speaking regions. Given the fact that there are over 10 million SME units in India (2011 estimate) with investment of above Rs. 1 lakh crore, the growth opportunity for local language computing is simply mindboggling. India’s SME sector has recorded double-digit growth during the last four years and it contributes 40% to industrial production and 6% to the country’s gross domestic product.
To serve these emerging markets, vendors are coming up with new solutions as well. For instance, Microsoft’s Project Bhasha program intends to bring together the government, academia and research institutions along with local independent software vendors and developers. It aims to localize the company’s products such as Windows and MS Office. Microsoft India’s Karanam said, “We have been developing tools that enable the user to compute in 12 Indic languages as of now. Moreover, we are doing research to build digital infrastructure for additional Indic languages.”
The engineering team at open source player Red Hat works with Indian language communities to provide desktops in all official Indian languages. Owned and maintained by Red Hat, the Lohit family of fonts, with nearly 100% localization, makes input much easier.
Google is also doing its bit. Katragadda said, “Google is investing heavily in generating open source fonts in Indic languages and is encouraging the communities to build open source fonts, which are compatible with mainstream browsers such as Firefox.” Collaborating with open source communities, the company is also working to build an open information system in Indic languages to improve local fonts.
Community efforts are a huge contributor to local language computing growth. Free and open source software projects, volunteer communities and young enthusiasts, among others, are playing a vital role. Santhosh Thottingal, Software Engineer, Internationalization and Localization, Wikimedia Foundation, talked about an independent attempt to localize tools used under Linux and other free software environments into all Indian languages: “The IndLinux project took many important steps in the initial days of the Indic language desktop. This community worked towards producing a usable Indic language desktop based on GNU/Linux, which has proper fonts, input methods and a bug free rendering engine.”
Similarly, several volunteer-driven communities such as the Fedora Project’s Indic portal, Swathanthra Indian Language Computing Project (Silpa), Sanchaya (a Kannada computing project), Swecha (a project that supports Telugu computing), Swathanthra Malayalam computing, Free Tamil Computing and Marathi Open Source Project are all contributing to regional language computing. These Free and Open Source Software (FOSS) communities also do a great job in the localization of desktop environments into local Indian languages. Localized GNU/Linux is available in most Indian languages. Communities and enthusiasts also work to localize the latest versions of applications such as Gnome, KDE, Firefox, Open Office/LibreOffice into local languages.
However, for the enterprise market, such applications are rare as of now. SAP is an early mover in this space. Navaneet Mishra, Vice-President – Globalization Services, SAP Labs India, said, “Our maiden foray into localizing enterprise applications will help enable Employee-Self-Service (ESS) and Manager-Self-Service (MSS) applications in Hindi. This will be followed by Hindi-enabling the entire ERP Suite, including all transactions and user interfaces.” Recently, SAP also launched three new solutions in this space. The first of the lot is the File Lifecycle Management solution, which aims to move physical file handling into a digital space with file approval, storage, archiving and search features. The second, Policy Management Framework, enables organizations to conceive, draft, formalize, simulate, launch and monitor a new policy. The third is the Address Data Cleansing solution. It will be able to parse any address data in India and convert it from a raw form into an accurate result, locatable on digital maps, in a local context. “In the future, we may also look at enhancing this solution with a citizen’s charter that will enable end users to interact with the officials and apply online for documents such as a birth certificate,” said Mishra.
As technology advances, more Indic language tools will be available both for private use and for enterprise users. One of the exciting developments will be in speech-to-text and text-to-speech translation between English and Indic languages, which will surely open a brand new world of language computing in India.
Challenges and the way ahead
In spite of its obvious potential, Indic language computing faces several challenges. One of the biggest roadblocks is the lack of quality content available. Muthu Nedumaran, Founder, Murasu Systems, said, “In the beginning, we embraced English as the language of computing. So local language content became a problem later.”
Google’s Katragadda added, “To create high quality content in local languages is still a big problem.”
MS Sridhar, Managing Partner, Smart Solutions, said, “Transferring the available content into the local language is a painful process unless the technology improves. We provide a developer toolkit, which includes translation modules, transliteration modules etc. in local languages.”
Lack of awareness is another bugbear. Thottingal of Wikimedia Foundation said, “Training is required for system integrators so that they can configure and install everything that is required to make local language computing possible.” Also, more training facilities are required for common people to learn how to type in the local language. This issue can be tackled through awareness programs and mobilization methods. Agencies such as C-DAC and TDIL are already conducting programs to create awareness about the need for Indic language computing.
As technologies advance, the horizon of capabilities will also widen. This will help both vendors and users in creating and consuming information in local languages. Similarly, as demand increases, we will see more developers using Indic language support for building high-quality applications with significant benefits. It will also pave the way for devices to become cheaper and more accessible for the rural population. Along with organizations and communities, the government is also conducting various programs to stimulate the use of local languages in computing. With computing going mobile, local language computing will become a necessity rather than a nice-to-have. Of course, it will take some time to reach the masses but initiatives should be continued and breakthroughs made. As that happens, experts believe that what is true in English language computing today will come true for all Indian languages in future. When that happens, a rich and vibrant Indic Web will emerge, one that’s highly relevant and hyper local.