A Novel Framework for Context-aware Outlier Detection in Big Data Streams

Title: A Novel Framework for Context-aware Outlier Detection in Big Data Streams
Publication Type: Journal Article
Year of Publication: 2018
Authors: Ahmad, H, Dowaji, S
Journal: Journal of Digital Information Management
Volume: 16
Issue: 5
Start Page: 213
Pagination: 213-222
Date Published: 10/2018
Type of Article: Research
Abstract

Outlier and anomaly detection has always been a critical problem in many fields. Although it has been investigated deeply in data mining, the problem has become more difficult and more critical in the Big Data era, because the volume, velocity, and variety of data change drastically and outliers take more complicated forms. In such an environment, where real-time outlier detection and analysis over data streams is a necessity, existing solutions are no longer effective or sufficient. While many existing algorithms and approaches consider the content of the data stream, few consider the context and conditions in which that content was produced. In this paper, we propose a novel framework for contextual outlier detection in big data streams that injects the contextual attributes into the stream content as a primary input for outlier detection, rather than using the stream content alone or applying contextual detection to content anomalies only. The detection algorithm combines two approaches: a supervised detection method and an unsupervised one, which allows the detection process to adapt to normal changes in stream behavior over time. The detected outliers are either both content and contextual outliers or contextual outliers only. The proposed contextual detection approach prunes false-positive outliers and, at the same time, detects outliers that content-only detection misses. Moreover, in this framework, the detection engine preserves both the outliers and the context values in which those outliers were detected, for use in the engine's self-training and in outlier modeling, in order to improve outlier prediction accuracy.
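
The idea of using context as a primary input can be illustrated with a minimal Python sketch. It is not the authors' algorithm: it swaps in a simple per-context running z-score (Welford's online variance) for the unsupervised part only and omits the supervised method entirely. The names `RunningStats`, `ContextualDetector`, `threshold`, and `warmup`, as well as the labels it returns, are assumptions made for illustration.

```python
import math
from collections import defaultdict


class RunningStats:
    """Online mean and variance (Welford's method), one pass over the stream."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def zscore(self, x):
        if self.n < 2:
            return 0.0
        std = math.sqrt(self.m2 / (self.n - 1))
        if std == 0.0:
            return 0.0 if x == self.mean else math.inf
        return abs(x - self.mean) / std


class ContextualDetector:
    """Hypothetical detector: scores each value against its own context."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # assumed z-score cutoff
        self.warmup = warmup        # samples needed before a model is trusted
        self.global_stats = RunningStats()                # content-only view
        self.context_stats = defaultdict(RunningStats)    # one model per context
        self.outlier_log = []  # outliers kept together with their context

    def _flag(self, stats, value):
        return stats.n >= self.warmup and stats.zscore(value) > self.threshold

    def process(self, value, context):
        ctx = self.context_stats[context]
        content_flag = self._flag(self.global_stats, value)
        context_flag = self._flag(ctx, value)
        if context_flag:
            label = "content+context" if content_flag else "context-only"
            self.outlier_log.append((value, context, label))
            return label
        # Only normal points update the models, so they follow gradual drift
        # without being pulled toward the outliers themselves.
        ctx.update(value)
        self.global_stats.update(value)
        return "normal"


detector = ContextualDetector()
for day, night in zip([29, 30, 31, 30, 29, 31, 30, 30],
                      [9, 10, 11, 10, 9, 11, 10, 10]):
    detector.process(day, "day")
    detector.process(night, "night")

print(detector.process(30, "night"))  # typical by day, unusual at night
print(detector.process(100, "day"))   # unusual in any context
```

In this sketch a reading of 30 is unremarkable against the whole stream but is flagged when the context is "night", which corresponds to the "contextual outliers only" case in the abstract. A value that looks extreme globally but is ordinary within its own context is returned as normal, which is one simple way to prune false positives.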

URL: http://dline.info/fpaper/jdim/v16i5/jdimv16i5_1.pdf
DOI: 10.6025/jdim/2018/16/5/213-222
Refereed Designation: Refereed

Collaborative Partner

Institute of Electronic and Information Technology (IEIT)