
Project Gecko

As generative AI transforms productivity and access to knowledge, its benefits must extend beyond English-speaking and Western-centric contexts. Historically, each industrial revolution has introduced equity gaps that deepened societal divides. In this fourth industrial revolution, driven by AI, there is a valid concern that these disparities could widen further. The challenge lies in how this technology navigates the complexities of non-English languages and cultures. At Microsoft Research, we are committed to building AI models and copilots that are equitable by design, capable of serving diverse linguistic and cultural contexts, and focused on closing, not widening, global opportunity gaps.

Project Gecko is a cross-lab Microsoft Research initiative advancing the Equitable Generative AI Project by developing equitable models and trusted multilingual copilots that are designed for population-scale use and can serve entire communities.

Co-led by the Microsoft Research Accelerator, Microsoft Research India, and Microsoft Research Africa, Nairobi, Gecko explores how AI can be adapted and deployed widely in low-resource settings. The project integrates research in multilingual small and large language models, multilingual speech models, synthetic data creation for low-resource settings, HCI, and grounded evaluation. To do this, we apply a blend of NLP, ethnographic design, and machine learning methods to create new multidisciplinary paradigms for building human-centered AI, with real-world deployments spanning agriculture and education.

Why This Matters:

Large language models have the potential to transform how people access information and services. Yet, most are trained on high-resource languages and reflect dominant cultural contexts. Gecko seeks to reverse this dynamic — building AI systems from the ground up, shaped by the knowledge, languages, and modalities of the global majority. Achieving population-scale impact requires a fundamental rethinking of how AI is localized, evaluated, and deployed.

Core Research Areas:

  • Multimodal grounding using trusted community videos
  • Speech-first interfaces for oral-first language users
  • Local adaptation of LLMs, SLMs, and speech models to support a range of languages (including Kiswahili, Hindi, Kikuyu) and cultural contexts in India and East Africa
  • Synthetic data creation for low-resource settings
  • Human-centered evaluation frameworks for trust, relevance, and cultural alignment
  • Ethnomethodologically-informed research and design to co-create AI experiences with local users

Initial Focus:

The initial phase of Project Gecko centers on agriculture in East Africa and South Asia, where we are investigating how trusted, multilingual Copilots can operate at population scale. We are working in partnership with Microsoft Research Africa, Nairobi; Microsoft Research India; and Digital Green, a global development organization that partners with governments and grassroots organizations to build community-driven digital infrastructure for agriculture. Together, we are exploring the infrastructure, modeling strategies (particularly around small language models), and evaluation protocols necessary to enable scaled deployment.

This milestone involves Farmer.Chat, an AI-powered web app assistant developed by Digital Green that enables smallholder farmers to engage with community-contributed agricultural video content via a speech-first interface tailored for use in Kiswahili and Kikuyu.

Initial pilots in Kenya demonstrate measurable improvements in response quality, usability, and user trust — offering early signals for how community-grounded, multilingual Copilots might perform in similar contexts.

Through this effort, we are also evaluating how the VeLLM platform can serve as a replicable playbook for grounded Copilot development. By analyzing what works in this agricultural context, we aim to identify generalizable design patterns, tools, and infrastructure that can be extended to future domains such as education and health — helping to enable scalable, locally relevant Copilots across a wide range of linguistic and cultural settings.

Our Mission:

To accelerate generative AI adoption in regions where low-resource languages, oral knowledge, and community media are central to daily life.

Platform Foundation:

Project Gecko is built on VeLLM (uniVersal Empowerment with LLMs), a platform developed by Microsoft Research India that supports:

  • Multilingual and multimodal Copilot development
  • Grounding in culturally contextual and community-contributed data
  • Principled evaluation of trust, utility, and equity

Key Highlights:

  • Target Audience: Project Gecko focuses on enabling population-scale Copilots for the global majority — prioritizing support for low-resource languages and content grounded in oral and video-based knowledge.
  • Core Platform: Project Gecko leverages VeLLM, a platform developed by Microsoft Research India, to support multilingual, multimodal Copilot creation grounded in culturally relevant data. VeLLM is designed to be a replicable foundation for Copilots across domains and geographies.
  • Initial Phase: The first milestone is centered in East Africa and South Asia, in collaboration with Microsoft Research Africa, Microsoft Research India, and Digital Green. The pilot explores agricultural Copilots using community video, small language models, and speech-first interfaces via the Farmer.Chat app.
  • Research Focus: Spans applied research in multimodal retrieval, small language models, grounding and trust evaluation, and participatory design. The goal is to identify scalable patterns for Copilot development across underserved contexts.
  • Looking Ahead: Project Gecko reflects a broader commitment within Microsoft Research: to ensure that the next generation of AI is not only powerful — but globally inclusive, culturally relevant, and shaped by the communities it aims to serve.