Modernizing legacy software systems is an increasingly pressing need. Legacy systems are often outdated and difficult to maintain, which poses challenges for organizations that need to innovate or scale. Modernization is a complex process that is often hindered by technical debt (TD), which drives up the cost of maintaining and adapting legacy code. TD is multi-faceted: it includes architectural, code, design, and documentation debt, each of which can impede modernization. This thesis proposes the design and implementation of a modular, extensible system that leverages Large Language Models (LLMs), modern protocols such as the Model Context Protocol (MCP), and graph-based retrieval to identify, contextualize, and prioritize technical debt in simulated legacy software systems. The system focuses on delivering stakeholder-oriented insights, particularly in MLOps environments, where software and machine learning components co-evolve and accumulate complex forms of debt across code, data pipelines, and model behavior. By ingesting diverse artifacts (e.g., code analysis outputs, issue trackers, and team communications), the system constructs an LLM-navigable knowledge graph that enables both technical and non-technical stakeholders to explore and reason about technical debt through natural language queries. This approach aims to improve transparency, support informed decision-making, and guide the prioritization of refactoring efforts in complex, evolving systems.
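To make the idea of a debt-oriented knowledge graph more concrete, the following is a minimal sketch, not the thesis implementation. It assumes a toy graph in which ingested artifacts are attached to the component they mention, and it ranks components by a deliberately naive summed-severity score. The names `Artifact`, `DebtGraph`, the 1 to 5 severity scale, and the sample data are all hypothetical; the LLM and Model Context Protocol layers that would sit on top of such a graph are omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Artifact:
    """One ingested item; every field here is an illustrative assumption."""
    source: str      # e.g. "static_analysis", "issue_tracker", "chat"
    component: str   # code or pipeline component the artifact refers to
    debt_type: str   # e.g. "code", "architecture", "documentation", "data"
    severity: int    # hypothetical scale: 1 (minor) to 5 (critical)
    summary: str


class DebtGraph:
    """Toy knowledge graph: component nodes linked to debt-artifact nodes."""

    def __init__(self):
        self.edges = defaultdict(list)  # component -> list of Artifact

    def ingest(self, artifact):
        """Attach an artifact to the component it mentions."""
        self.edges[artifact.component].append(artifact)

    def neighbors(self, component, debt_type=None):
        """Retrieve artifacts linked to a component, optionally by debt type."""
        return [a for a in self.edges.get(component, [])
                if debt_type is None or a.debt_type == debt_type]

    def prioritize(self):
        """Rank components by summed severity, highest first (naive score)."""
        scores = {c: sum(a.severity for a in arts)
                  for c, arts in self.edges.items()}
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample data standing in for real ingested artifacts.
graph = DebtGraph()
graph.ingest(Artifact("static_analysis", "billing", "code", 4,
                      "High cyclomatic complexity"))
graph.ingest(Artifact("issue_tracker", "billing", "architecture", 3,
                      "Tight coupling to legacy DB"))
graph.ingest(Artifact("chat", "feature_pipeline", "data", 5,
                      "Undocumented schema drift"))

ranking = graph.prioritize()
print(ranking)
```

In a fuller system, a natural language question such as "which component carries the most debt?" would presumably be translated by the LLM into graph lookups like `neighbors` and `prioritize`; how that translation works is outside the scope of this sketch.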
Project information
In progress
Master
Muhammad Raghib
2026-006