India to use Artificial Intelligence apps to foster translation of numerous official languages for ease of citizens

For a few weeks this year, villagers in the southern Indian state of Karnataka read out dozens of sentences in their native Kannada language into an app as part of a project to build the country’s first Artificial Intelligence (AI) chatbot for tuberculosis. There are more than 40 million Kannada speakers in India. Kannada is one of the country’s 22 official languages and one of more than 121 languages spoken by 10,000 people or more in the world’s most populous nation.

However, few languages are covered by Natural Language Processing (NLP), the branch of artificial intelligence that enables computers to understand texts and spoken words. Hundreds of millions of Indians are thus excluded from useful information and many economic opportunities.

For AI tools to work for everyone, they need to cater to people who do not speak English, French, or Spanish, said Kalika Bali, the principal researcher at Microsoft Research India. “But if we have to collect as much data in Indian languages as went into a large language model like GPT, we would be waiting another ten years. So, we can create layers on top of generative AI models such as ChatGPT or Llama,” Bali told the Thomson Reuters Foundation.

The villagers in Karnataka are among thousands of speakers of different Indian languages generating speech data for the tech firm Karya, which is building datasets for firms such as Microsoft and Google to use in AI models for education, healthcare, and other services.

Bhashini App

The Indian government, which aims to deliver more services digitally, is also building language datasets through Bhashini, an AI-led language translation system that is creating open-source datasets in local languages for building AI tools. The platform includes a crowdsourcing initiative for people to contribute sentences in various languages, validate audio or text transcribed by others, translate texts and label images. Tens of thousands of Indians have contributed to Bhashini.

“The government is pushing very strongly to create datasets to train large language models in Indian languages, and these are already in use in translation tools for education, tourism, and in the courts,” said Pushpak Bhattacharya, the head of the Computation for Indian Language Technology Lab in Mumbai.

Economic Value

Of more than 7,000 living languages in the world, fewer than 100 are captured in major NLP systems, with English the most advanced. ChatGPT, whose launch triggered a wave of interest in generative AI, is trained primarily in English, while Google Bard is limited to English. Of the nine languages that Amazon Alexa can respond to, only three are non-European: Arabic, Hindi, and Japanese.

Grassroots organisation Masakhane aims to strengthen NLP research in African languages, while in the UAE, a new large language model in AI by Time Magazine in September 2023. Crowdsourcing also helps to capture linguistic, cultural, and socio-economic nuances, said Bali.

“But there has to be awareness of gender, ethnic and social bias. It has to be done ethically, by educating workers, paying them, and making a specific effort to collect smaller languages,” she said. “Otherwise, it will not scale.”

With the rapid growth of AI, there is demand for languages “we haven’t even heard of”, including from academics looking to preserve them, said Karya co-founder Safiya Husain.

Karya works with non-profit organisations to identify workers who are below the poverty line, or who have an annual income of less than $325, and pays them about $5 an hour to generate data, well above the minimum wage in India.

Workers own a part of the data they generate so they can earn royalties, and there is potential to build AI products for the community with that data in areas such as healthcare and farming, Husain said.

Village Voice

Fewer than 11% of India’s 1.4 billion people speak English. Much of the population is not comfortable reading and writing, so several AI models focus on speech and speech recognition.

Google-funded Project Vaani, or voice, is collecting speech data of about 1 million Indians and open-sourcing it for use in automatic speech recognition and speech-to-speech translation.

Bengaluru-based Ek Step Foundation’s AI-based translation tools are used at the Supreme Courts in India and Bangladesh, while the government-backed AI4Bharat centre has launched Jugalbandi, an AI-based chatbot that can answer questions on welfare schemes in several Indian languages.

The bot, named after a duet where two musicians riff off each other, uses language models from AI4Bharat and reasoning models from Microsoft, and can be accessed on WhatsApp, which is used by 500 million people in India. Gram-Vaani, or the Voice of the Village, a social enterprise that works with farmers, also uses AI-based chatbots to respond to questions on welfare benefits.
