An Assistive Tool to Convert Jupyter Notebook to Structured Repository

Manually bridging the gap between machine learning experimentation and operational software is difficult and time-consuming. This thesis therefore explores the concept and implementation of an assistive tool designed to ease the transition of experimental code from Jupyter notebooks to production-ready, structured repositories. After identifying the challenges of manual conversion, such as susceptibility to errors, time consumption, and the absence of effective software engineering workflows, the research presents a novel implementation of such a tool, based on notebook cell labeling, script generation, and storage of the generated scripts in an organized directory structure. By automating several vital steps of the conversion process, the tool aims to significantly reduce manual labor and to ease future collaboration through integration with version control systems. Additionally, the study presents the concept of syncing changes between experimental and production code, offering a solution that minimizes redundancy and promotes error-resistant software engineering practices. Finally, the basic idea for an additional, significant feature, code tracing, is discussed.
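
To make the labeling-and-generation idea concrete, the following is a minimal sketch and not the thesis tool itself. It assumes that labels are stored as cell tags in the notebook's JSON (the `metadata.tags` field of the nbformat cell schema) and that each label maps to one target script. The label names, the `LABEL_TO_PATH` layout, and the function names `group_cells_by_label` and `write_scripts` are all hypothetical and chosen for illustration only.

```python
from pathlib import Path

# Hypothetical mapping from cell label to target script path. This layout is
# an assumption for illustration, not the directory structure of the thesis tool.
LABEL_TO_PATH = {
    "data": "src/data/load_data.py",
    "model": "src/models/model.py",
    "train": "src/train.py",
}


def group_cells_by_label(notebook: dict) -> dict:
    """Collect the source of labeled code cells per label, in notebook order."""
    grouped = {}
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells are not turned into scripts
        source = cell.get("source", "")
        # nbformat stores cell source either as a string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for tag in cell.get("metadata", {}).get("tags", []):
            if tag in LABEL_TO_PATH:
                grouped.setdefault(tag, []).append(source)
    return grouped


def write_scripts(notebook: dict, repo_root: Path) -> list:
    """Write one script per label under repo_root and return the written paths."""
    written = []
    for label, sources in group_cells_by_label(notebook).items():
        target = repo_root / LABEL_TO_PATH[label]
        target.parent.mkdir(parents=True, exist_ok=True)
        # Cells sharing a label are concatenated, separated by a blank line.
        target.write_text("\n\n".join(sources) + "\n", encoding="utf-8")
        written.append(target)
    return written
```

Under these assumptions, a notebook loaded with `json.loads` from an `.ipynb` file could be passed to `write_scripts` together with the root of a repository that is already under version control. Unlabeled cells are simply skipped in this sketch, so exploratory code stays in the notebook.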

Project information

Status:

Finished

Thesis for degree:

Master

Student:

Anum Dastgir

Supervisor:
Part of research project:

SE4ML - Processes, People and Tools

Id:

2023-032