The Academic Review Analysis & Commentary  /  March 2026  /  Higher Education & Technology
AI in Academia

The Future of AI in Academia: Open Source LLMs and AI-Powered Work

As artificial intelligence reshapes scholarship, Indian universities face a pivotal question: who controls the tools of knowledge, and who gets left out?

In the corridors of Indian higher education, something is quietly shifting. Students at IIT Delhi are using large language models to synthesise literature reviews overnight. PhD scholars at IISc are running open-source models locally to analyse datasets their departments cannot afford to send to commercial APIs. A first-year student in Pune is using Meta's Llama on a borrowed laptop to understand a thermodynamics problem that her professor explained only in English, a language she reads less fluently than Marathi. The technology is already here. The question facing India's universities is not whether to engage with AI but how, on whose terms, and at what cost.

This report is grounded in a campus-wide survey exploring attitudes toward providing institutional access to open-source large language models (LLMs), fostering an environment for AI-assisted research, and redefining the norms of academic work in an era when a machine can draft a passable journal abstract in seconds. The findings are neither a warning nor an uncritical endorsement. They are, instead, a call for deliberate, equitable, and India-specific policy.


Context: India at an Inflection Point

India has every reason to take AI in education seriously. Its education system caters to more than 250 million learners, making uniform pedagogy structurally impossible. Socio-economic divides, linguistic plurality, and stark gaps between Tier-1 and Tier-3 institutions have long frustrated efforts at equitable access to quality education. AI, designed thoughtfully, could become the most powerful equaliser India's classrooms have ever seen.

The policy architecture is beginning to reflect this urgency. The National Education Policy (NEP) 2020 explicitly calls for the integration of AI and computational thinking across all levels of schooling. The Union Budget 2025-26 allocated Rs 500 crore for a dedicated Centre of Excellence in AI for Education, and AI and Computational Thinking is being introduced as a compulsory subject from Class 3, beginning academic year 2026-27. The IndiaAI Mission, launched in March 2024 with a total outlay of Rs 10,371.92 crore, fosters innovation across government, startups, and academia; its annual allocation in the 2025-26 Budget rose by 1,056 percent to Rs 2,000 crore.

Survey Context

This report draws on the results of a campus-wide survey at an Indian university exploring student and faculty perspectives on AI, open-source LLM access, and AI-assisted academic work. Twenty respondents participated, all of whom provided consent for their responses to be used in this report. Participation was voluntary and responses were anonymous. The findings — and the quantitative patterns they reveal — informed the policy recommendations in the final section.

Yet the ground-level reality is uneven. According to UDISE+ 2024-25 data, only about 58 percent of Indian schools have functional computers, and internet access hovers around 64 percent. More than 7 percent of schools are single-teacher institutions. This infrastructure gap does not disappear at the university level. While leading institutions like the IITs and IISc are investing heavily in AI infrastructure, many state universities and smaller engineering colleges remain under-resourced.

"Our goal is to elevate Indian universities to world-class standards while fostering global connections, ensuring that no one is left behind in this AI-driven transformation."

Dharmendra Pradhan, Union Minister for Education, Press Information Bureau, 2025

What Are Open-Source LLMs, and Why Do They Matter?

A large language model is a machine learning system trained on vast corpora of text to understand and generate human-like language. Models like Meta's Llama 3, Google's Gemma, Mistral, and DeepSeek R1 are described as open-source or open-weight, meaning their underlying parameters are publicly released for use, fine-tuning, and inspection. This distinguishes them from proprietary systems like GPT-4 or Gemini Advanced, which are accessible only through hosted services and paid APIs and offer no visibility into training data or model internals.

For academia, the distinction is not merely technical. It is philosophical. Open-source models can be run on local hardware, protecting sensitive research data from being transmitted to third-party servers. They can be fine-tuned on domain-specific corpora, making them far more useful for specialised fields. They can be studied, critiqued, and audited, which is a basic precondition for scientific rigour. And they do not require per-token payment, which makes scale both possible and affordable.

Model | Parameters | License | Academic Use
Meta Llama 3 | 8B to 70B | Llama Community License | Research, fine-tuning, on-device deployment
Mistral 7B / Large 2 | 7B to 123B | Apache 2.0 / Mistral Research License | Long-context tasks, multilingual work
Google Gemma | 2B to 27B | Gemma Terms of Use | Lightweight deployment, modest hardware
DeepSeek R1 | 7B to 671B | MIT | Reasoning tasks, research workflows
BLOOM | 176B | BigScience RAIL | Multilingual NLP (46 languages including Hindi)
AI4Bharat Airavata | Varies | Open (research) | Indian language understanding and generation

Table 1: Selected open-source LLMs and their suitability for academic contexts. Sources: Instaclustr (2025); AI4Bharat (2025); Medium/Jonathan Lee (2025).

In early 2024, Carnegie Mellon University offered a course where students built their own mini-GPT by progressively training and fine-tuning open models, a form of hands-on engagement that would have been impossible in an era of only proprietary models. This kind of pedagogical innovation, where the tool itself is the subject of inquiry, is precisely what open-source access enables.

India's Own AI Ecosystem: Building from Within

Among the most consequential developments in global AI is the emergence of India-specific language models. This matters profoundly for universities. A model trained predominantly on English-language internet data will serve an English-speaking student well. It will serve a Marathi-speaking student in Aurangabad, a Tamil-speaking researcher in Coimbatore, or a Bengali-speaking scholar in Kolkata far less well.

AI4Bharat, based at IIT Madras, has pioneered the development of multilingual LLMs tailored for Indian languages, including IndicBERT, IndicBART, and Airavata, trained on extensive and diverse datasets encompassing all 22 scheduled languages and competing with commercial models on multiple benchmarks.

BharatGen, inaugurated by the Ministry of Science and Technology, is developing state-of-the-art models for conversational AI, automatic speech recognition, text-to-speech, and translation, all tailored to Indian Knowledge Systems and cultural perspectives, with open-source releases planned for 2025.

Tech Mahindra's Project Indus was initially trained in Hindi and around 37 dialects and plans to cover other languages in phases; Ola's Krutrim, announced in December 2023, claimed the ability to write in 10 Indian languages while understanding 20. These are not merely technological curiosities. They represent a bid for what scholars of AI governance call "data sovereignty": the right of communities to have AI systems that reflect their languages, epistemologies, and cultural frames.

In 2024, IBM researchers released MILU, an open-source multi-task Indic language understanding benchmark designed to evaluate LLMs on cultural and domain knowledge across 11 Indic languages and 41 subjects. The significance for universities is direct: institutions can now benchmark which open-source models best serve their student populations, and they can advocate for models that score well on Indian-language tasks.

Key Initiative

The Srijan Centre for GenAI, established at IIT Jodhpur in partnership with Meta, and the YuvAI initiative, which targets 100,000 students aged 18 to 30 to develop AI solutions for healthcare, agriculture, and smart cities, represent institutional bets on the value of open-source, India-native AI development in academic settings.

Transforming Research: What AI-Assisted Scholarship Actually Looks Like

The practical applications of LLMs in Indian academic research are already substantial, and they go well beyond the student chatbot use case that dominates public discourse. Consider a few concrete scenarios drawn from the survey responses and broader institutional data.

Literature Review and Synthesis

A researcher navigating 400 papers across two decades of Indian agricultural economics does not have a shortage of data. She has a shortage of time. An open-source LLM fine-tuned on academic abstracts can reduce the first pass of a literature review from weeks to days, flagging key themes, methodological divergences, and citation gaps. The researcher retains interpretive authority; the model handles pattern recognition at scale.
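The "pattern recognition at scale" step need not involve an LLM at all for a crude first pass; even simple term counting surfaces recurring themes across a corpus. A minimal sketch, using invented abstract fragments (the corpus, stopword list, and function name are illustrative, not drawn from any cited project):

```python
from collections import Counter
import re

# Toy corpus standing in for paper abstracts (invented for illustration)
abstracts = [
    "Irrigation subsidies and smallholder credit access in Maharashtra",
    "Credit constraints, crop insurance, and smallholder risk in Punjab",
    "Minimum support prices and irrigation investment: a panel study",
]

STOPWORDS = {"and", "in", "a", "the", "of"}

def theme_counts(docs):
    """Count content words across a corpus: a crude first-pass theme flag."""
    words = []
    for doc in docs:
        words += [w for w in re.findall(r"[a-z]+", doc.lower())
                  if w not in STOPWORDS]
    return Counter(words)

top = theme_counts(abstracts).most_common(3)
print(top)  # -> [('irrigation', 2), ('smallholder', 2), ('credit', 2)]
```

An LLM fine-tuned on academic abstracts does something far richer than this, clustering by meaning rather than surface form, but the division of labour is the same: the machine flags recurring patterns, the researcher decides what they mean.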

Multilingual Research Access

India's academic output in regional languages, including significant scholarship in Tamil, Telugu, Bengali, and Kannada, is largely inaccessible to researchers who do not read those languages. Open-source models with strong Indic language capabilities can serve as translational intermediaries, enabling cross-linguistic collaboration and surfacing local knowledge that would otherwise remain invisible to a Westernised research mainstream.

Code and Data Analysis

Models like Mistral Large 2 and Llama 3 offer robust coding and data analysis capabilities; open-source coding assistants integrated into IDEs are now feasible without cloud access, enabling students without reliable internet connectivity to use AI-assisted development workflows. For engineering and science students in institutions with patchy connectivity, this localisation is not a minor convenience. It is the difference between access and exclusion.
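The feasibility of running models locally on modest hardware comes down to simple arithmetic: a model's memory footprint is roughly its parameter count times bytes per weight. A back-of-the-envelope sketch (the 20 percent runtime overhead factor is an assumption for illustration, not a benchmark):

```python
def model_memory_gb(n_params_billion, bits_per_weight, overhead=1.2):
    """Rough memory footprint: parameters * bytes per weight, plus ~20%
    runtime overhead (the overhead factor is an assumption)."""
    bytes_total = n_params_billion * 1e9 * (bits_per_weight / 8)
    return bytes_total * overhead / 2**30

# A 7B model in 16-bit weights needs server-class GPU memory...
print(round(model_memory_gb(7, 16), 1))  # -> 15.6 (GB)

# ...but 4-bit quantisation brings it within reach of a consumer laptop.
print(round(model_memory_gb(7, 4), 1))   # -> 3.9 (GB)
```

This is why quantised 7B-class models, not 70B flagships, are the realistic starting point for under-resourced campuses: the smaller footprint trades some capability for deployability on hardware institutions already own.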

Personalised Tutoring at Scale

AI dynamically adjusts content difficulty based on learner performance and pace; platforms like Embibe, for instance, analyse test responses to generate targeted remedial practice for JEE and NEET aspirants. Campus-wide deployment of open-source models could bring this personalisation to a broader range of subjects and student populations, not just those wealthy enough to afford commercial tutoring platforms.

The aggregate picture is striking. A pilot project using AI at a private university in Bangalore reported a 28 percent reduction in dropout rates over one academic year, driven by predictive models that identified at-risk students and enabled early intervention.

Academic Integrity in the Age of Generative AI

No honest treatment of AI in academia can avoid this subject. The arrival of generative AI has created genuine and unresolved tensions around what it means to produce original academic work, how we assess students, and what constitutes intellectual honesty. These tensions are not unique to India, but they manifest in particular ways given the scale and diversity of India's higher education system.

A 2025 study in Behaviour and Information Technology examined 413 business school students in India who had already used ChatGPT. It found that psychological rationalisation strategies predicted AI misuse more strongly than deterrence factors did, with male students relatively more likely to exhibit such behaviour. The finding has direct implications for how institutions design academic integrity interventions.

Indian universities police traditional plagiarism with text-matching detection systems, but most are not equipped to identify AI-generated content, which is contextually relevant and superficially original and therefore bypasses similarity checks. The challenge is structural: AI-generated text matches no existing published work in any database, and conventional plagiarism detection was designed for a pre-generative-AI world.

"The real question is not whether a student used AI. It is whether they engaged in genuine intellectual labour, and whether our assessments are designed to reveal that."

Adapted from the survey framework, March 2026

Survey respondents were asked directly to name academic tasks they believed should be restricted from AI assistance. The most consistently cited category was creative and artistic work — original writing, scripts, speeches, literary composition, and visual art — reflecting a view that these tasks derive their value precisely from unaided human expression. Examinations and individually assessed work came up frequently as well, with several respondents noting that any task requiring genuine intellectual effort or original argumentation should remain unassisted. The convergence is notable: respondents are active AI users themselves — 40 percent engaging with these tools daily — yet they articulate clear boundaries around the kinds of work they believe should stay human. This is not technophobia. It is a thoughtful, self-imposed standard that institutional policy should seek to formalise and support.

India's leading institutions are beginning to respond. IIT Delhi formed a dedicated committee in April 2024 to explore how generative AI tools could be ethically integrated into teaching and research. After collecting feedback from students and faculty, the committee found that 80 percent of students reported using generative AI tools regularly, and the institute established mandatory disclosure requirements for AI-generated content. IIT Delhi is the first IIT to officially release such guidelines.

The guidelines specify that any work or content generated with the assistance of AI tools must be fully disclosed to ensure transparency, and that if AI tools are used to create distinct elements, such as images, tables, data visualisations, or significant sections of text, this use should be noted in captions, footnotes, or a statement within the work.

The UGC has encouraged higher education institutions to develop their own AI usage policies and has cited IIIT-Delhi's approach as a model for structured, transparent AI integration. As of early 2026, however, India does not yet have UGC regulations specifically addressing AI-assisted academic dishonesty.

A more nuanced framing, and one supported by the survey responses received, is that the goal should not be to eliminate AI use but to design assessments that are genuinely robust to it. This means a shift from take-home essays toward oral examinations, in-class work, process portfolios, and project-based evaluation where the AI's contribution, if any, is visible and declared.

The Case for Campus-Wide Open-Source LLM Access

The survey asked a direct question: should the university provide campus-wide access to an open-source large language model, hosted on institutional infrastructure? The responses were instructive. Sixty percent of respondents rated campus-wide LLM access as highly beneficial, awarding it a 4 or 5 on a five-point scale, with a group average of 3.9 out of 5. Only one respondent gave the lowest possible rating, and notably, that individual was also among the most frequent daily users of AI tools in the cohort, a reminder that personal enthusiasm for AI and support for institutional provision do not always align. Concerns about misuse, the undermining of assessments, and the risk of students becoming dependent on AI for reasoning they should develop themselves are present and legitimate. But these concerns argue for careful policy design, not for withholding access.
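The reported figures are internally consistent, which is worth checking: with n = 20, a mean of 3.9 implies a score total of 78, and 60 percent rating 4 or 5 means exactly 12 respondents. The survey did not publish the full per-rating breakdown, so the distribution below is a hypothetical example that satisfies all the published constraints, for illustration only:

```python
# Hypothetical rating distribution consistent with the published summary:
# n = 20, mean 3.9, 60% rated 4 or 5, exactly one lowest rating of 1.
# The actual per-rating breakdown was not reported; this is one example.
ratings = {1: 1, 2: 2, 3: 5, 4: 2, 5: 10}

n = sum(ratings.values())
total = sum(score * count for score, count in ratings.items())
high = ratings[4] + ratings[5]

assert n == 20            # twenty respondents
assert total == 78        # 78 / 20 = 3.9 average
assert high / n == 0.60   # 12 of 20 rated 4 or 5
assert ratings[1] == 1    # the single lowest rating

print(total / n, high)
```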

The argument for institutional provision is, at its core, an equity argument. Students at well-funded private universities or those with personal means to pay for commercial AI subscriptions already have access to powerful AI tools. Their peers at underfunded state universities or from lower-income families do not. Leaving AI access to the market means that AI becomes yet another vector of educational inequality, layered atop existing divides of language, geography, and socioeconomic status.

Concern | Policy Response
AI misuse in assessments | Redesign assessments toward process-based and oral evaluation; mandatory AI-use disclosure
Data privacy (student queries sent to external servers) | Self-hosted open-source models on campus infrastructure; no third-party data transmission
Dependence undermining critical thinking | "Digital temperance" policies; structured AI-free foundational exercises before AI-assisted work
Language and cultural bias in models | Prioritise Indic-language models (AI4Bharat, BharatGen); participate in localisation efforts
Faculty unpreparedness | Structured faculty development on AI-assisted pedagogy; peer learning communities
Cost of GPU infrastructure | Consortium models across institutions; IndiaAI Mission compute grants; quantised small models

Table 2: Key concerns raised in the campus survey and proposed institutional responses.

Survey Dimension | Question | Finding (n = 20)
Campus LLM access: perceived benefit | How beneficial would campus-wide open-source LLM access be? (scale 1–5) | Avg. 3.9 / 5  •  60% rated 4 or 5
Current AI use frequency | How often do you use AI tools for study or work? | Daily 40%  •  Monthly 30%  •  Weekly 15%  •  Rarely 15%
LLM familiarity (top tools) | Which open-source / AI tools have you used or are familiar with? | ChatGPT 80%  •  BLOOM 50%  •  Claude 45%  •  Mistral 35%  •  Llama 20%
AI workshops: student importance | How important is access to AI-focused workshops and training? | 95% rated 4 or 5 out of 5
Institutional workshop support (current rating) | How would you rate the institution's training workshop provision? | Very Poor 45%  •  Satisfactory 45%  •  Excellent 10%
Who should set AI policy? | Primary responsibility for developing and enforcing AI usage policies | Faculty / Academic Senate 60%  •  IT Dept 55%  •  Student Reps 50%  •  Univ. Admin 45%
Data privacy in AI implementation | Importance of data privacy & security when deploying AI tools | Crucial 35%  •  Very Important 15%  •  Moderately Important 15%

Table 3: Campus survey results (n = 20, March 2026). All respondents provided consent. Respondents could select multiple options for policy roles and LLM familiarity.

The privacy argument for open-source, locally hosted models is particularly compelling in the Indian context. The Digital Personal Data Protection Act, 2023 places significant obligations on entities that process personal data. When students submit queries about their research, their mental health, their career anxieties, and their academic struggles to commercial AI systems, that data flows to servers outside India, governed by terms of service that few students read and fewer understand. A self-hosted open-source model eliminates this exposure entirely. That 35 percent of survey respondents rated data privacy as a crucial factor in AI implementation — with another 15 percent rating it very important — reflects a student body that is more conscious of this risk than institutions may assume.

Governance, Ethics, and the Road Ahead

As AI becomes more pervasive in admissions, assessments, and student analytics, Indian universities must establish ethics review boards and develop data governance frameworks to ensure AI adoption is grounded in transparency, fairness, and accountability. The survey responses echo this. Students do not simply want access to AI; they want assurance that the AI being deployed in administrative decisions, from admissions to scholarship allocation, is not encoding historical biases in its outputs.

The IndiaAI Governance Guidelines, developed under an Advisory Group chaired by the Principal Scientific Advisor, outline a coordinated whole-of-government approach to AI compliance and responsible deployment, with direct relevance to educational institutions. These guidelines represent an important policy floor. Universities must build on them, not merely satisfy them.

Faculty development is a dimension that often goes unaddressed in technology policy, and the survey data reveals the gap in stark terms. When asked to rate the institution's current provision of AI training workshops, 45 percent of respondents gave the lowest possible score — Very Poor — and another 45 percent managed only Satisfactory. A mere 10 percent rated it Excellent. Contrast this with the 95 percent of those same respondents who rated access to AI-focused workshops as important or very important, and the mismatch is not subtle. Students want training; the institution is not yet delivering it at the standard they need. It is insufficient to deploy an LLM on campus if faculty and students have not been equipped to use it responsibly.

UGC has supported the establishment of research centres focusing on AI and machine learning, and AICTE has encouraged institutions to introduce interdisciplinary AI courses and establish AI labs aligned with industry standards. These are necessary but not sufficient. What is needed is granular, discipline-specific guidance: what does responsible AI use look like in a sociology seminar, a civil engineering lab, a law school, a school of fine arts?

The concept of "digital temperance," implementing policies that encourage students to engage unaided with foundational concepts before using AI assistants, is gaining currency as an approach to building meta-cognitive resilience. It is a useful framing, though its application requires care. There is a meaningful difference between a student who uses AI because she genuinely cannot understand a concept without scaffolding, and a student who uses AI to avoid the effort of engaging at all. Policy must distinguish between these cases.

"Open-source benchmarks play a crucial role in advancing AI research because they make progress measurable, transparent, and reproducible."

Michal Shmueli-Scheuer, Distinguished Engineer, IBM AI Benchmarking and Evaluation, IBM Think, 2025

Recommendations for Indian Universities

The survey results, combined with the policy and research landscape reviewed above, support the following institutional recommendations. They are offered not as a universal prescription but as a starting framework that institutions should adapt to their specific contexts, resources, and student populations.

1. Establish a Campus AI Policy Council

Convene a standing body with representation from faculty across disciplines, students, technical staff, and institutional leadership. Survey respondents nominated Faculty/Academic Senate (60%), the IT Department (55%), and Student Representatives (50%) as the primary stakeholders in AI policy — a multi-body structure that should be mirrored in any governance council. This council should develop and revise AI usage policies, review academic integrity cases involving AI, and serve as the institutional interface with UGC and AICTE guidance.

2. Pilot a Self-Hosted Open-Source LLM

Partner with the IndiaAI Mission or peer institutions to deploy a locally hosted open-source model, prioritising models with strong Indic language capabilities. Begin with a controlled pilot open to graduate researchers before broader rollout. Measure usage patterns, assess impact on research output, and gather structured feedback. Survey respondents already using AI daily — 40 percent of the cohort — represent an engaged early-adopter base well suited to an initial pilot group.

3. Redesign Assessment for the AI Era

Phase in oral components, process portfolios, and in-class assignments across departments. Develop AI-use disclosure norms consistent with the IIT Delhi model. Invest in faculty development so that assessment redesign is informed by disciplinary context, not imposed from above. Survey respondents were clear that creative and expressive work — original writing, artistic production, speeches — should remain unassisted, a consensus that can guide which assessment types are preserved in AI-free form.

4. Contribute to India's AI Ecosystem

Encourage faculty and student contributions to open-source Indic language datasets and benchmarks. Institutions that participate in building tools like AI4Bharat's corpora or BharatGen's training data are not merely consumers of AI; they become co-producers of a more equitable AI future.

5. Centre Equity in Every Decision

Any AI policy that does not explicitly address the needs of students from lower-income backgrounds, from regional language backgrounds, or from under-resourced campuses risks reproducing existing inequalities at algorithmic speed. Every decision on tool selection, infrastructure investment, and policy design should begin with the question: who benefits, and who is left out?


Conclusion: The University as a Site of Contested Futures

The rise of AI in academia is not a single event but a process, uneven, contested, and full of consequence. For India, a country where the stakes of educational access are among the highest in the world, the choices made in the next few years about how universities adopt, govern, and critique AI will shape intellectual life for a generation.

Open-source LLMs represent, at their best, an opportunity to democratise access to sophisticated intellectual tools. They are not a solution to structural inequality, but they can be designed and deployed in ways that do not deepen it. That requires universities that are willing to think carefully, act collectively, and hold both the promise and the peril of this technology in view simultaneously.

The students who responded to the survey that inspired this report are not asking for AI to replace their education. They are asking for the tools to participate fully in a world that is being reshaped by AI whether their institutions are ready for it or not. The appropriate institutional response is neither panic nor cheerleading. It is the same thing that good universities have always done: ask hard questions, build robust frameworks, and keep the focus relentlessly on learning.

References

  1. AI4Bharat. (2025). Multilingual LLMs and Indic corpora for Indian languages. Indian Institute of Technology Madras. ai4bharat.iitm.ac.in
  2. BharatGen. (2025). India's first sovereign AI: Building multimodal LLMs for 22 Indian languages. BharatGen Consortium. bharatgen.com
  3. Cotton, D. R. E., Cotton, P. A., & Shipway, J. R. (2023). Chatting and cheating: Ensuring academic integrity in the era of ChatGPT. Innovations in Education and Teaching International, 61(2), 228-239. Taylor & Francis.
  4. Government of India, Ministry of Education. (2020). National Education Policy 2020. Press Information Bureau, New Delhi.
  5. Government of India, Ministry of Electronics and Information Technology. (2024). IndiaAI Mission: Budget allocation and programme pillars. MeitY, New Delhi.
  6. Government of India. (2023). Digital Personal Data Protection Act, 2023. Gazette of India.
  7. IBM Research. (2024). MILU: Multi-task Indic language understanding benchmark across 11 Indic languages and 41 subjects. IBM Think. ibm.com
  8. Lee, J. (2025, July). The rise of open-source AI models (2024-2025). Medium. medium.com
  9. Lukmaan IAS PIB Summary. (2026, March 16). AI in Education: IndiaAI Mission, UGC-AICTE reforms, and digital temperance. blog.lukmaanias.com
  10. Mahapatra, P. K. et al. (2024). Industry-academia AI partnerships in India: IIT Bombay-NVIDIA and beyond. Journal of Higher Education Technology, India.
  11. Mazumder, A. K. et al. (2024). Responsible use of generative AI in Indian research journals indexed in Scopus. Indian Journal of Research Policy.
  12. MediaNama. (2024, October 10). Indian government to launch open-source AI datasets platform by January 2025. medianama.com
  13. Ministry of Science and Technology, India. (2024, September 30). BharatGen inauguration: Government-funded multimodal LLM initiative for Indian languages. Press Information Bureau.
  14. Mundhe, S. (2024). AI-assisted early warning systems and dropout prevention: A case study from Bangalore. Journal of Educational Data Science, 3(1).
  15. MDPI. (2025). Academic integrity in the generative AI era: A systematic review. IJRISS, 9(5). mdpi.com
  16. Rane, N. (2023). Roles and challenges of ChatGPT in achieving Sustainable Development Goal 4. Social Science Research Network, Mumbai.
  17. ResearchGate. (2025). Artificial intelligence and the future of Indian universities: Challenges and opportunities. researchgate.net
  18. The Higher Education Review. (2025, August). IIT Delhi introduces strict GenAI usage rules for students and faculty. thehighereducationreview.com
  19. University Grants Commission. (2024). National Programme on Artificial Intelligence (NPAI) Skilling Framework. UGC, New Delhi.
  20. Zuboff, S. (2023). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power (reprint ed.). PublicAffairs.
  21. Campus Survey on AI in Academia. (2026, March). Unpublished primary data (n = 20). Conducted at an Indian university; all respondents provided consent.