We consider the problem of anomaly detection with a small set of partially
labeled anomaly examples and a large-scale unlabeled dataset. This is a common
scenario in many important applications. Existing related methods either
exclusively fit the limited anomaly examples that typically do not span the
entire set of anomalies, or proceed with unsupervised learning from the
unlabeled data. We instead propose a deep reinforcement learning-based
approach that enables an end-to-end optimization of the detection of both
labeled and unlabeled anomalies. This approach learns the known abnormality by
automatically interacting with an anomaly-biased simulation environment, while
continuously extending the learned abnormality to novel classes of anomaly
(i.e., unknown anomalies) by actively exploring possible anomalies in the
unlabeled data. This is achieved by jointly optimizing the exploitation of the
small labeled anomaly data and the exploration of the rare unlabeled anomalies.
Extensive experiments on 48 real-world datasets show that our model
significantly outperforms five state-of-the-art competing methods.
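
To make the interaction loop concrete, the following is a minimal sketch of what an anomaly-biased environment with a joint exploitation/exploration reward could look like. It is an illustration under stated assumptions, not the implementation evaluated in the experiments: the class `AnomalyBiasedEnv`, the sampling probability `p_anomaly`, the `novelty` score, and all reward values are hypothetical choices, and observations are plain floats for brevity.

```python
import random
import statistics


def novelty(x, unlabeled):
    """Hypothetical novelty score in [0, 1]: scaled distance from the unlabeled mean."""
    mean = statistics.mean(unlabeled)
    spread = statistics.pstdev(unlabeled) or 1.0  # avoid dividing by zero
    return min(1.0, abs(x - mean) / (3.0 * spread))


def reward(action, is_labeled_anomaly, novelty_score):
    """Assumed reward: action 1 flags an anomaly, action 0 treats the point as normal."""
    if is_labeled_anomaly:
        # Exploitation: a known anomaly should be flagged.
        return 1.0 if action == 1 else -1.0
    if action == 0:
        # Unlabeled data is mostly normal, so passing it over is neutral.
        return 0.0
    # Exploration: flagging an unlabeled point pays off only if it looks unusual.
    return novelty_score - 0.5


class AnomalyBiasedEnv:
    """Toy environment that over-samples the small labeled anomaly set."""

    def __init__(self, labeled_anomalies, unlabeled, p_anomaly=0.5, seed=0):
        self.labeled_anomalies = list(labeled_anomalies)
        self.unlabeled = list(unlabeled)
        self.p_anomaly = p_anomaly  # assumed bias toward labeled anomalies
        self.rng = random.Random(seed)
        self.current = None
        self.is_labeled_anomaly = False

    def _sample(self):
        # Draw the next observation from one of the two pools.
        self.is_labeled_anomaly = self.rng.random() < self.p_anomaly
        pool = self.labeled_anomalies if self.is_labeled_anomaly else self.unlabeled
        self.current = self.rng.choice(pool)
        return self.current

    def reset(self):
        return self._sample()

    def step(self, action):
        # Score the action on the current observation, then move to the next one.
        score = novelty(self.current, self.unlabeled)
        r = reward(action, self.is_labeled_anomaly, score)
        return self._sample(), r
```

In this sketch, an agent (for example a Q-learning policy, not shown) would call `reset()` once and then `step(action)` repeatedly. Raising `p_anomaly` shifts the balance toward exploiting the labeled anomalies, while lowering it gives more weight to exploring the unlabeled data.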
