GSoC 2026 Week 0: Kicking Off with DBpedia

Onboarding, model benchmarking, and LangGraph selection for Amharic ontology mapping.

#gsoc-2026 #week-0 #nlp #knowledge-graphs #amharic #dbpedia

This summer I'm working on GSoC 2026 with DBpedia to extend the Amharic DBpedia Chapter. The goal is to build an agentic system that maps entities found in Amharic text to their correct DBpedia ontology classes, a critical step toward making Amharic knowledge accessible in the global knowledge graph.

The project leverages Afro-XLM-R, a multilingual transformer optimized for African languages, fine-tuned with LoRA and orchestrated through LangGraph.

I had my first onboarding meeting with my mentors: Prof. Dr. Ricardo Usbeck, Andargachew, Tilahun, and Hizkiel. We introduced ourselves and scheduled weekly sync meetings for Fridays at 2:00 PM. I also joined all the primary communication channels: Slack, WhatsApp, and Microsoft Teams.

My mentors have rich backgrounds. Prof. Dr. Ricardo Usbeck leads AI research at Leuphana University in Germany. Andargachew is a lecturer at Addis Ababa University specializing in knowledge graphs and NLP. Tilahun is a PhD researcher at Leuphana working on hybrid QA systems for low-resource languages. Hizkiel is a PhD candidate at Paderborn University focused on NLP and Digital Humanities.

I reread the Amharic DBpedia Chapter paper in depth and analyzed the three property-retriever models from the DICE-Research team. I also refreshed my understanding of BERT and of parameter-efficient fine-tuning methods such as LoRA.
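To make the LoRA idea concrete, here is a minimal pure-Python sketch of the low-rank update it relies on: the pretrained weight matrix W stays frozen, and only two small matrices A and B are trained, so the layer effectively uses W + (alpha / r) * B A. The matrix sizes, the `alpha` value, and the helper names are toy assumptions for illustration, not settings from this project.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, the weight a LoRA-adapted layer effectively uses."""
    r = len(a)  # the rank is the number of rows of A
    scale = alpha / r
    delta = matmul(b, a)  # (d_out x r) @ (r x d_in) -> d_out x d_in
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Parameters trained by full fine-tuning vs. by LoRA, for one weight matrix."""
    return d_out * d_in, r * (d_out + d_in)


random.seed(0)
d_out, d_in, r, alpha = 6, 4, 2, 4  # toy sizes, not values from the project
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # trainable
b = [[0.0] * r for _ in range(d_out)]                                  # trainable, zero init

# With B initialised to zero, the adapted layer starts out identical to the frozen one.
assert lora_effective_weight(w, a, b, alpha) == w

full, lora = trainable_params(768, 768, 8)
print(full, lora)  # 589824 12288
```

For a single 768 x 768 projection (the hidden size of base-sized BERT-style encoders) and rank 8, LoRA trains 12,288 parameters instead of 589,824, which is why it fits comfortably on a single free-tier GPU.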

I benchmarked mBERT, XLM-R, and Afro-XLM-R on a Kaggle GPU. Afro-XLM-R came out ahead of the other two, so I chose it as the base model for Amharic ontology mapping.
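A comparison like this boils down to scoring each model's predictions against the same gold labels and ranking the results. The sketch below shows one way to do that with macro-F1 in plain Python. The choice of macro-F1 is my assumption for illustration, and the labels, predictions, and model names (`candidate_a`, `candidate_b`) are placeholders, not the actual benchmark data or results.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def rank_models(y_true, predictions):
    """Return (model_name, macro_f1) pairs sorted from best to worst."""
    scored = {name: macro_f1(y_true, y_pred) for name, y_pred in predictions.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Placeholder gold labels and predictions, purely illustrative.
gold = ["Person", "Place", "Person", "Organisation"]
predictions = {
    "candidate_a": ["Person", "Place", "Place", "Organisation"],
    "candidate_b": ["Person", "Place", "Person", "Organisation"],
}

ranking = rank_models(gold, predictions)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Macro-averaging gives every ontology class the same weight, which matters when a few classes dominate the data; a weighted or micro average would be the alternative if overall accuracy is the priority.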

I selected LangGraph as the primary framework for the agentic orchestration layer. It is used in production by companies such as LinkedIn and has strong community support.
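The pattern that makes a graph-based orchestrator attractive is simple: nodes are functions that update a shared state, and routing functions decide which node runs next. The sketch below illustrates that pattern in plain Python. It is deliberately not LangGraph's API; `TinyGraph`, the node names, the confidence threshold, and the hard-coded lookup standing in for the classifier are all hypothetical.

```python
END = "__end__"


class TinyGraph:
    """A toy state graph: each node updates the state, each router picks the next node."""

    def __init__(self):
        self.nodes = {}
        self.routers = {}

    def add_node(self, name, fn, router):
        self.nodes[name] = fn
        self.routers[name] = router

    def run(self, start, state, max_steps=10):
        current = start
        for _ in range(max_steps):
            # Merge the node's partial update into the shared state.
            state = {**state, **self.nodes[current](state)}
            current = self.routers[current](state)
            if current == END:
                return state
        raise RuntimeError("graph did not reach END within max_steps")


def classify(state):
    # Stand-in for the fine-tuned classifier: a hard-coded lookup.
    known = {"አዲስ አበባ": ("Place", 0.9)}
    label, confidence = known.get(state["entity"], ("Thing", 0.2))
    return {"label": label, "confidence": confidence}


def route_after_classify(state):
    # Low-confidence predictions are sent to a fallback node.
    return END if state["confidence"] >= 0.5 else "fallback"


def fallback(state):
    # Keep the most generic class and flag the entity for manual review.
    return {"label": "Thing", "needs_review": True}


graph = TinyGraph()
graph.add_node("classify", classify, route_after_classify)
graph.add_node("fallback", fallback, lambda state: END)

result = graph.run("classify", {"entity": "አዲስ አበባ"})
print(result["label"])  # Place
```

Keeping the routing logic in small, separate functions makes each decision easy to test on its own, which is the property I want from the real orchestration layer as well.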

Over the next 11 weeks I will be posting weekly updates. Next up: diving deeper into the Afro-XLM-R fine-tuning pipeline and setting up the LangGraph agent architecture.

Natnael Yohanes

Backend AI Engineer focused on ML systems, system design, distributed systems, and blockchain infrastructure.