India stands at a pivotal point in its attempt to enter the generative AI race. By backing domestic compute capacity, coupled with indigenous models and data, India can forge a leadership position in the AI ecosystem, rooted in inclusivity and digital sovereignty.
Generative Artificial Intelligence (Gen-AI) is a subset of Artificial Intelligence (AI) that enables machines to generate realistic, human-like content. A Gen-AI model encodes the input into a machine-readable form and maps it to the most closely related patterns learned from the data it has been trained on. The final output is a contextually appropriate response to the input prompt. As Gen-AI technologies reshape industries and human interactions globally, India is gradually carving out its place in this evolving landscape.
With more than 700 million internet users and an estimated 600,000 AI professionals, India’s AI sector is highly scalable. The growth of a robust technological infrastructure has played a key role in the emergence of about 2,915 AI startups in the country. Large Language Models (LLMs) are surging in popularity, with ChatGPT and Gemini gaining ground in India. Indians currently account for 13.5 percent of global users of ChatGPT, making India its largest user base. DeepSeek, the Chinese open-source model, has been gaining market share due to its cost advantages and compute efficiency. India is currently DeepSeek’s fourth-largest customer base worldwide, with approximately 43.36 million website visits, according to a report published in February 2025.
The main issue with foreign LLMs is the inherent bias in their responses. For example, DeepSeek exhibits significant pro-China bias when queried on geopolitical and historical issues ranging from Kashmir to Arunachal Pradesh, as well as the Tiananmen Square incident.
However, despite seemingly high adoption rates, a recent study found that only 31 percent of Indians have used Gen-AI platforms. This could be because foreign models, which rely on non-local datasets, cannot cater effectively to India’s multilingual population, socio-cultural diversity and contextual realities. The result is often culturally inaccurate output, which hampers last-mile adoption. For instance, Meta AI generated a ‘man with a turban’ four times out of five when asked to generate an image of an Indian, despite India’s vast demographic and cultural diversity. Moreover, low formal and digital literacy creates additional barriers to adoption, further deepening the trust deficit caused by the cultural mismatch.
As the second-largest generator of digital data globally, India has the potential to provide high-quality datasets for model training and make AI tools more accessible to the country’s underserved populations. The government’s dataset platform, AIKosh, is the starting point of a high-quality data capture initiative, with institutions like IIT Bombay contributing over 16 datasets to the platform. It also complements the Bhashini initiative, an Indic translation tool that aims to overcome linguistic, digital and literacy barriers in India.
Government support has generated strong momentum in India’s Gen-AI space. Once fully deployed, Indic LLMs could offer what foreign LLMs currently cannot: accurate, unbiased communication in Indian languages. However, challenges persist as the Indic Gen-AI sector navigates evolving compute supply chains, data governance issues and questions of ethical alignment.
Developing Gen-AI for India
Instead of remaining a passive recipient of foreign technological innovation, India is set to pursue sustainable capacity building on a larger scale. As foreign models are not trained to reflect Indian contexts, India is rethinking its relationship with data. Treating data as an asset rather than a commodity means that data generated by Indians must be used to train indigenous models. This approach ensures that the economic, diplomatic and intellectual value of AI development remains within India. Moreover, it would set up new indigenous value chains, from research and development (R&D) to model training and deployment, and strengthen India’s stance in technology diplomacy as a leading technology voice from the Global South.
The IndiaAI Mission, launched in 2024 with a budget of over INR 10,000 crore, has been driving AI innovation. Within this initiative, the Ministry of Electronics and Information Technology (MeitY) has selected start-ups like Sarvam, Soket, Gnani AI and Gan AI to build India’s indigenous LLM ecosystem.
Soket will develop a 120 billion parameter open-source model optimized for India’s defence, healthcare and education sectors. Gnani AI will build a 14 billion parameter voice model with fast speech processing, while Gan AI, a company that has previously provided tech solutions to companies such as Google and Amazon, is building a 70 billion parameter ‘superhuman’ text-to-speech model, capable of surpassing human intelligence. In addition to the MeitY-selected start-ups, the Department of Science and Technology has been supporting a multi-modal AI model, BharatGen, to boost public service delivery and citizen engagement.
To train indigenous models, the government launched the India AI Dataset Platform, AIKosh, where diverse, anonymized and non-personal data is stored. Models on AIKosh, like Hercule-HI and Hercule-BN, support translation from English into Hindi and Bangla, with the aim of better alignment with human judgment in low-resource settings. These models incorporate dialectal nuance to improve translation accuracy. AIKosh aims to enable data discoverability and encourage innovation. Skill development programmes like ‘YuvAI’ and ‘Srijan’, run in collaboration with premier institutes and technology companies like Meta, will offer young professionals much-needed exposure to realize their potential.
The prospect of developing a fully open-source and inclusive model could improve public trust in AI as a transformative technology, raising adoption rates of models like Sarvam, which have seen low uptake since launch.
Challenges for Indic LLMs
Accounting for 16 percent of the global AI workforce, India does not lack talent. This talent pool has to be used effectively to nurture a solid indigenous AI ecosystem. LLMs that are not optimized for linguistic and cultural diversity can misrepresent cultural iconography and vocabulary. India’s decentralization across various sectors can prove to be a source of strength. The growing trend towards centralized AI tools shaped by foreign influences and datasets risks overlooking local nuance, limiting the generation of relevant and representative content.
Developing and training LLMs requires significant compute capacity. India’s computing infrastructure reportedly accounted for less than 2 percent of global capacity in 2024. With compute capacity currently exceeding 34,000 GPUs, India is working on building up its domestic capacity. In parallel, given the high carbon footprint of training LLMs, there needs to be an added focus on developing critical infrastructure, such as utilities, that can sustain and help scale India’s efforts in AI. For Gen-AI to take off in India, therefore, a combined focus on talent, data and compute is required, alongside the sustainable provision of utilities like energy.
The government must focus on the key challenges to AI adoption: access, affordability and relevance to local needs. Building on its efforts with LLMs will pave the way for adoption at the last mile. Going forward, many local enterprises such as kiranas, clinics and farms, with their diverse languages and needs, could leverage Small Language Models, a cleaner and more optimized class of AI model. This will allow AI tools like voice assistants and smart document readers to function seamlessly across sectors like healthcare, agriculture and social welfare, encouraging faster adoption of AI at the grassroots level.
Artificial Intelligence is poised to become an integral part of daily life. Therefore, the models must be able to communicate effectively, without language and information barriers. India has a golden opportunity to move forward in the Gen-AI space with indigenous innovation that sets a new global paradigm – one that is inclusive, low-cost and efficient. Indigenous models must not be treated as a secondary consideration, but rather as a strategic one.