This work addresses the problem of code reusability within Jupyter Lab, a popular environment for data scientists and ML engineers. Taking as a foundation the initial recommendation workflow proposed by [Rah23] to enhance code reusability within the Jupyter environment, we analyze its limitations in performance and accuracy. We present novel algorithmic approaches, including an activity-based labeling algorithm and a state-of-the-art hybrid search mechanism incorporating a vector database, to enhance code reusability and deliver highly relevant code suggestions. We implement these methods in a Jupyter Lab extension named JupyterRecSys, which provides a user-friendly GUI embedded within Jupyter Lab’s existing interface. A comprehensive complexity analysis demonstrates the system’s efficiency and scalability. Together, these contributions make code reuse within the Jupyter environment easier.
Project information
Finished
Bachelor
Ouyu Xu
2023-027