A framework for integration of real-time anomaly detection algorithms in event-driven software systems
Problem
Anomaly detection algorithms are mostly designed by data scientists or mathematicians who often have only basic knowledge of software engineering principles. The design of these algorithms therefore focuses on increasing precision or recall, while quality attributes such as fault tolerance, recoverability, maintainability, and availability usually receive little attention. Furthermore, the algorithms are written and published in whichever programming language the data scientist prefers.
Goal
The implementation complexity of anomaly detection algorithms for time series streaming should be reduced by constructing a framework that gives data scientists an easy way to integrate their algorithms and rules. The framework should be evaluated using three existing open-source anomaly detection algorithms.
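To make the integration goal concrete, the following is a minimal sketch of what a plug-in contract for such a framework could look like. It is an illustration under stated assumptions, not the design of the thesis: the names `AnomalyDetector`, `RunningZScoreDetector`, and `DetectorDemo` are hypothetical, and the running z-score is only a stand-in for a real algorithm.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical plug-in contract: one observation in, one anomaly score out. */
interface AnomalyDetector {
    /** Returns a score >= 0; larger means more anomalous. */
    double score(long timestampMillis, double value);
}

/** Illustrative stand-in detector: absolute z-score against a running mean and variance (Welford's method). */
final class RunningZScoreDetector implements AnomalyDetector {
    private long count = 0;
    private double mean = 0.0;
    private double m2 = 0.0; // running sum of squared deviations from the mean

    @Override
    public double score(long timestampMillis, double value) {
        // Score the value against the statistics of earlier values, then update them.
        double result = 0.0;
        if (count >= 2) {
            double std = Math.sqrt(m2 / (count - 1));
            if (std > 0.0) {
                result = Math.abs(value - mean) / std;
            }
        }
        count++;
        double delta = value - mean;
        mean += delta / count;
        m2 += delta * (value - mean);
        return result;
    }
}

final class DetectorDemo {
    /** Feeds the values through the detector and returns the indices whose score exceeds the threshold. */
    static List<Integer> flag(AnomalyDetector detector, double[] values, double threshold) {
        List<Integer> flagged = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (detector.score(i * 1000L, values[i]) > threshold) {
                flagged.add(i);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        double[] values = {10.0, 10.2, 9.9, 10.1, 10.0, 25.0, 10.1};
        System.out.println(flag(new RunningZScoreDetector(), values, 3.0)); // prints [5]
    }
}
```

With a contract of this kind, the framework would only ever call `score`, so a data scientist could swap in a different algorithm without touching the event-handling code.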
Tasks
- Requirements Analysis: e.g. identification of deployment strategies for different classes of anomaly detection algorithms, quality requirements for the framework, architectural requirements for the framework
- Architecture Design: based on the requirements analysis, e.g. management of anomaly detection scores, management of anomaly detection models
- Prototype Implementation, including three different anomaly detection algorithms
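The management of models and scores named in the architecture task could be sketched as follows. This is a hedged illustration only: `DetectorRegistry`, `RegistryDemo`, and `jumpModel` are hypothetical names, a `DoubleUnaryOperator` stands in for a trained model, and discarding a failing model is just one possible recovery policy.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.DoubleUnaryOperator;
import java.util.function.Supplier;

/** Hypothetical registry: one model per stream key, plus the latest score per key. */
final class DetectorRegistry {
    private final Supplier<DoubleUnaryOperator> modelFactory;
    private final Map<String, DoubleUnaryOperator> models = new ConcurrentHashMap<>();
    private final Map<String, Double> latestScores = new ConcurrentHashMap<>();

    DetectorRegistry(Supplier<DoubleUnaryOperator> modelFactory) {
        this.modelFactory = modelFactory;
    }

    /** Scores one event. A failing model is discarded and the event yields no score. */
    Optional<Double> onEvent(String streamKey, double value) {
        DoubleUnaryOperator model = models.computeIfAbsent(streamKey, k -> modelFactory.get());
        try {
            double score = model.applyAsDouble(value);
            latestScores.put(streamKey, score);
            return Optional.of(score);
        } catch (RuntimeException e) {
            // Isolate the fault: drop the broken model so a fresh one is built on the next event.
            models.remove(streamKey);
            return Optional.empty();
        }
    }

    Optional<Double> latestScore(String streamKey) {
        return Optional.ofNullable(latestScores.get(streamKey));
    }

    int modelCount() {
        return models.size();
    }
}

final class RegistryDemo {
    /** Stand-in model: score is the absolute jump from the previous value; rejects NaN input. */
    static DoubleUnaryOperator jumpModel() {
        double[] previous = {Double.NaN};
        return value -> {
            if (Double.isNaN(value)) {
                throw new IllegalArgumentException("NaN input");
            }
            double score = Double.isNaN(previous[0]) ? 0.0 : Math.abs(value - previous[0]);
            previous[0] = value;
            return score;
        };
    }

    public static void main(String[] args) {
        DetectorRegistry registry = new DetectorRegistry(RegistryDemo::jumpModel);
        registry.onEvent("sensor-a", 1.0);
        System.out.println(registry.onEvent("sensor-a", 4.0));        // Optional[3.0]
        System.out.println(registry.onEvent("sensor-a", Double.NaN)); // Optional.empty
        System.out.println(registry.latestScore("sensor-a"));         // Optional[3.0]
    }
}
```

Keeping model state per stream key means a fault in one stream does not affect the others, which touches on the fault tolerance and recoverability attributes mentioned in the problem statement.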
Involved Technologies
- Required: Java
- Optional: Python, Kafka (Consumer, Producer, Processor)
Literature
- Golmohammadi, S. K. (2016). Time series contextual anomaly detection for detecting stock market manipulation. Doctoral dissertation, University of Alberta.
- Dunning, T., & Ertl, O. (2019). Computing Extremely Accurate Quantiles Using t-Digests. (Implementation: https://github.com/tdunning/t-digest)
- Ahmad, S., Lavin, A., Purdy, S., & Agha, Z. (2017). Unsupervised real-time anomaly detection for streaming data. (Implementation: https://github.com/smirmik/CAD)
Supervisor
Project information
Finished
Master
Chunxia Yang
2021-009