A Knowledge Reuse Approach for ML Prototypes

High employee turnover in the field of machine learning (ML), combined with the fact that methodological knowledge, such as insights from attempted methods, is often not considered an essential result of ML tasks, frequently leads to the loss of valuable knowledge, especially knowledge beyond the actual code. This master's thesis investigates the criteria and practices that ML practitioners use when searching for, evaluating, and reusing knowledge during ML solution prototyping, focusing on the modeling phase. A qualitative study, consisting of 18 semi-structured interviews with ML practitioners followed by a thematic analysis, was used to gain key insights. The results show that practitioners primarily search for two types of knowledge: conceptual foundations and existing solutions. The search for knowledge is adaptive and relies mostly on digital sources. Sources are evaluated in a two-stage process (a relevance check followed by a content evaluation) that is influenced by factors such as source credibility, topicality, and community feedback. ML practitioners value quality characteristics such as comprehensibility, clear argumentation, and reproducibility. From these findings, a guideline for supporting knowledge reuse was derived, which includes requirements for knowledge management tools as well as recommendations for practitioners themselves. This thesis emphasizes the complexity of knowledge reuse in the ML domain beyond pure code reuse and offers concrete starting points for increasing the efficiency of knowledge reuse in ML solution prototyping.

Project information

Status:

Finished

Thesis for degree:

Master

Student:

Patrick Chrestin

Supervisor:
Part of research project:

SE4ML - Processes, People and Tools

Id:

2025-005