Mikav

Research

benchmark · evaluation · malayalam · Draft
MalayalamCultureBench: A Benchmark for Evaluating LLM Understanding of Kerala's Art, History, and Traditions
Introduces a new evaluation benchmark covering Kerala-specific culture (art forms, festivals, oral traditions, history), testing existing LLMs against the Mikav fine-tuned model to quantify cultural-awareness gaps.
Hrudu Shibu
dataset · corpus · malayalam · Draft
Building an Open Malayalam Culture Corpus: Collection, Cleaning, and Licensing of Low-Resource Heritage Data
Documents the methodology for sourcing, cleaning, and licensing Malayalam text and cultural heritage data (manuscripts, oral history, festival/art records) into an open, reusable dataset, addressing IP ownership and low-resource data challenges.
Hrudu Shibu
human-in-the-loop · verification · trust · Draft
Community-Verified AI: A Human-in-the-Loop Framework for Preserving Regional Cultural Knowledge
Proposes a verification methodology pairing domain experts (cultural institutions, practitioners) with AI-generated content to reduce hallucination risk and ensure trustworthy representation of niche cultural knowledge.
Hrudu Shibu
system-paper · architecture · open-source · Draft
Mikav: An Open-Source AI Copilot Bridging Cultural Heritage and Creative Entrepreneurship in Kerala
Presents the full Mikav system architecture (dataset → model → copilot → dev platform), deployment approach, and a case study from the SparkX cohort showing real-world usage and outcomes.
Hrudu Shibu
fine-tuning · llm · malayalam · Draft
Fine-Tuning Open LLMs for Native Malayalam Cultural Understanding: A Comparative Study
Compares fine-tuning of open model families (Llama, Qwen, Gemma) on the Malayalam culture corpus, evaluating language fluency and cultural-knowledge accuracy against baseline models.
Hrudu Shibu