The Indian Institute of Technology Roorkee (IIT Roorkee) has developed the world’s first AI-driven model to transliterate the historic Modi script into the Devanagari script. This pioneering initiative, led by Professor Sparsh Mittal, is not merely a technological marvel—it is a cultural renaissance, a digital revival of India’s forgotten linguistic treasures, and a significant contribution to national missions such as Digital India, Bhashini, BharatGPT, and Azadi Ka Amrit Mahotsav.
The core of this initiative lies in two powerful tools:
- MoScNet (Modi Script Conversion Network): An AI model built on an advanced Vision-Language Model (VLM) architecture that outperforms traditional OCR tools.
- MoDeTrans (Modi-Devanagari Transliterated Dataset): The first and only comprehensive dataset of real handwritten Modi manuscripts, annotated with verified Devanagari transliterations.
Both have been open-sourced on Hugging Face, allowing global researchers, developers, and heritage institutions to build upon and expand this innovation—an unprecedented move in the domain of ancient language AI.
IIT Roorkee develops the world’s first AI model to transliterate the historic Modi script into Devanagari. Led by Prof. Sparsh Mittal, MoScNet & MoDeTrans aim to digitize heritage, support Digital India, Bhashini & BharatGPT. Prof. K.K Pant emphasizes AI for Viksit Bharat.
— IIT Roorkee (@iitroorkee) July 18, 2025
The Modi script, not to be confused with the surname of any political figure, is a semi-cursive script historically used across Maharashtra and parts of Central and Western India. From royal edicts issued during the reign of Chhatrapati Shivaji Maharaj, to Peshwa-era administrative documents, and British-period land and legal records, the script served as the official medium for centuries.
Despite its long service, Modi fell into disuse after the colonial administration imposed Devanagari and English, and its scholarly practice came close to extinction. Today, an estimated 40 million Modi-script documents languish in archives, temples, private collections, and government departments, mostly undeciphered and inaccessible because expert readers are scarce.
Enter MoScNet. Built on a Vision-Language Model, it takes an image of handwritten Modi text, reads each character in context, and converts it into its Devanagari equivalent. Conventional Optical Character Recognition (OCR) systems tend to fail on handwritten, stylized, or historically degraded texts; MoScNet, by contrast, handles the fluidity, stylistic variation, and cursiveness of the Modi script with remarkable precision.
“This is not just a transliteration tool. It’s an instrument to unlock civilisational knowledge buried in ink and age, trapped in archives and lost dialects,” said Prof. Sparsh Mittal, the Principal Investigator and project lead at IIT Roorkee. “We are building AI that can respect tradition, decode heritage, and open new chapters of Indian history.”
MoScNet’s performance metrics have impressed AI researchers: the model shows state-of-the-art accuracy and robust scalability, and it can run efficiently even in the low-infrastructure environments where many of these manuscripts are held.
MoDeTrans: A dataset of national significance
The research team didn’t stop at creating a model—they built the very foundation for this technology through MoDeTrans, a curated, high-quality dataset comprising over 2,000 scanned images of authentic Modi manuscripts. These are categorized across three key historical periods:
- Shivakalin (17th century) – the time of Shivaji Maharaj
- Peshwekalin (18th century) – under the Peshwa administration
- Anglakalin (19th century) – during British rule
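As an illustration of how a period-tagged collection like this might be organized in code, here is a minimal Python sketch. The class and field names (`Period`, `ManuscriptRecord`, `image_path`, `devanagari`), the file paths, and the sample records are all hypothetical; they are assumptions for illustration and not the published MoDeTrans schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Period(Enum):
    """The three historical periods named in the article."""
    SHIVAKALIN = "17th century"
    PESHWEKALIN = "18th century"
    ANGLAKALIN = "19th century"


@dataclass(frozen=True)
class ManuscriptRecord:
    """One scanned page paired with its transliteration (hypothetical layout)."""
    image_path: str   # assumed: path to the scanned Modi manuscript image
    period: Period    # which of the three periods the page belongs to
    devanagari: str   # assumed: the expert-verified Devanagari transliteration


def count_by_period(records):
    """Return how many records fall into each historical period."""
    return Counter(record.period for record in records)


# Made-up placeholder records; the text is a dummy string, not real manuscript content.
SAMPLE_RECORDS = [
    ManuscriptRecord("scans/page_001.png", Period.SHIVAKALIN, "नमुना मजकूर"),
    ManuscriptRecord("scans/page_002.png", Period.PESHWEKALIN, "नमुना मजकूर"),
    ManuscriptRecord("scans/page_003.png", Period.PESHWEKALIN, "नमुना मजकूर"),
]

if __name__ == "__main__":
    for period, count in count_by_period(SAMPLE_RECORDS).items():
        print(f"{period.name} ({period.value}): {count} page(s)")
```

Grouping by period in this way would let a researcher check how evenly the three eras are represented before training or evaluating a model on the data.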
Each manuscript is accompanied by expert-verified Devanagari transliterations, making this not just a technical resource, but a linguistic heritage archive of enormous scholarly value. The team’s interdisciplinary effort also included notable contributions from young minds:
- Harshal and Tanvi, alumni of COEP Technological University (formerly College of Engineering Pune)
- Onkar, an alumnus of Vishwakarma Institute of Information Technology, Pune
Their combined efforts led to the creation of a tool that balances academic rigor with real-world usability—a rare feat in AI development for heritage contexts.
“This project represents what Viksit Bharat truly means,” said Prof. Kamal Kishore Pant, Director of IIT Roorkee. “We are using Artificial Intelligence not to replace humans, but to augment our ability to rediscover, reinterpret, and reimagine our roots. MoScNet gives voice to millions of forgotten documents and makes our timeless knowledge accessible to modern society.”
Indeed, the tool holds immense potential for:
- Land and legal record restoration in rural India
- Digitisation of Ayurvedic and medical treatises
- Academic research in linguistics, history, and literature
- Archival enhancement for museums, libraries, and temple trusts
- Citizen science initiatives in local language heritage
The AI model aligns directly with several flagship Indian missions:
- Digital India – by digitizing millions of manuscripts into machine-readable form
- Bhashini – by enhancing multilingual language accessibility
- BharatGPT – by laying the groundwork for large language models in Indic contexts
- National Language Translation Mission (NLTM) – by preserving and transliterating regional languages
On the global stage, the initiative directly supports target 11.4 of UN Sustainable Development Goal (SDG) 11: “Strengthen efforts to protect and safeguard the world’s cultural and natural heritage.” As climate change, time, and negligence threaten the survival of ancient documents, such AI frameworks could be adapted globally, from the Sanskrit Grantha script in Tamil Nadu to the Tibetan manuscripts of Ladakh and inscriptions in Khmer, Pali, and Thai.
In a remarkable show of open-source ethics and digital nationalism, the entire MoScNet model and MoDeTrans dataset have been made freely available on Hugging Face, encouraging community-driven development and scholarly engagement.
“We wanted to build something that is not locked behind a paywall, or buried under bureaucracy,” said Prof. Mittal. “Our commitment is to ethical AI, and that means empowering anyone—researchers, students, technologists, or even curious citizens—to take this forward.”
Already, institutions in Maharashtra and cultural ministries have expressed interest in deploying the tool for heritage recovery and digitization projects.
The IIT Roorkee team envisions expanding the framework to other endangered Indian scripts—including Sharda (Kashmir), Mahajani (Punjab and Rajasthan), and Grantha (Tamil Nadu). They are also exploring integration with voice-based systems, enabling spoken Devanagari outputs from scanned Modi texts—effectively allowing AI to read aloud the voices of history.
With India’s AI revolution moving rapidly, such tools place cultural context at the heart of innovation—a model the world can learn from.


















