A tutorial on "Target Class Learning for Anomaly/Outlier Detection: a robust strategy"

Machine learning techniques have advanced rapidly in recent years, and these improved technologies have been adopted in many application domains. An anomaly is an abnormal pattern in the data, and anomaly detection is a crucial task in many real-time applications. The anomaly or outlier detection task becomes more challenging when only the target class (class of interest) samples are available during training and samples of other classes are either ill-defined or absent. Several solutions have been proposed in this context, but despite extensive technological developments, anomaly/novelty detection remains a challenging task, and there is considerable scope to mimic the learning behaviour of the human brain. Motivated by the brain's capability to analyze anomalies, one-class classification strategies are adopted for better learning of the target class. Beyond the large number of samples, the high dimensionality of the data adds computational overhead and brings the curse of dimensionality. Learning models perform better when trained on the most promising samples and the most discriminating features. The selection of training samples and features must be supervised by the target class samples alone to ensure strong separation from outliers. With this motivation, this tutorial presents recent advancements in target class learning for anomaly detection, covering fundamentals, use cases, applications, and challenges. The tutorial also discusses future research directions.

Presenters’ details:

P. Nagabhushan
Affiliation: Vice Chancellor, Vignan's Foundation for Science, Technology & Research (Deemed to be University)
Address: Vadlamudi, Guntur-522213, Andhra Pradesh, India
Email: pnbhushan@vignan.ac.in

Sonali Agarwal
Affiliation: Associate Professor, Department of IT, Indian Institute of Information Technology, Allahabad, India
Address: Room No. 5203, CC-III Building, Indian Institute of Information Technology, Allahabad, Deoghat, Jhalwa, Prayagraj-211015, India
Email: sonali@iiita.ac.in

Sanjay Kumar Sonbhadra
Affiliation: Assistant Professor, Department of Computer Science and Engineering, ITER, Shiksha ‘O’ Anusandhan, Bhubaneswar, India
Address: J-15, Khandagiri Marg, Dharam Vihar, Jagamara, Bhubaneswar, Odisha 751030, India
Email: sanjaykumarsonbhadra@soa.ac.in

Narinder Singh Punn
Affiliation: Teaching Research Fellow, Department of IT, Indian Institute of Information Technology, Allahabad, India
Address: Room No. 5241, CC-III Building, Indian Institute of Information Technology, Allahabad, Deoghat, Jhalwa, Prayagraj-211015, India
Email: pse2017002@iiita.ac.in

Biographical sketch:

P. Nagabhushan
Prof. P. Nagabhushan is presently the Vice Chancellor of Vignan's Foundation for Science, Technology & Research (Deemed to be University), AP, India. He was Director of the Indian Institute of Information Technology Allahabad, Prayagraj (an Institute of National Importance by Act of Parliament) from May 2017 to March 2022. He also served as Chief Nodal Officer, Dean, and Chairman, with various academic administrative responsibilities and academic reform activities, at the University of Mysore, Mysore. As the founding Professor of the Department of Studies in Computer Science, University of Mysore, he focused on moulding the department into one centred on learning and research. He was responsible for shaping the department as a centre of excellence in Computer Cognition and Recognition, covering the areas of Pattern Recognition, Image Processing, Intelligence and Learning. Prior to this, he was coordinator of the M.Tech. program at SJ College of Engineering, Mysore. Prof. P. Nagabhushan has been active in implementing continuous learning, continuous assessment, and choice-based credit earning since 2000 at the department level, since 2010 at the University of Mysore, and since 2017 at the Indian Institute of Information Technology Allahabad; these practices are now being promoted by NEP2020. He has supervised 33 Ph.D. scholars and has authored more than 500 research papers, including more than 200 journal papers. He has been an invited academician and researcher in the USA, Japan, France, and Sudan. He was the Investigator of several research projects funded by UGC, MHRD, AICTE, ICMR, ISRO, IFCAR, DRDO and MHA. He has received many awards for his academic roles and is the recipient of fellowships from the Institute of Engineers (FIE), the Institute of Electronics and Telecommunication Engineers (FIETE), and the International Academy of Physical Sciences (FIAPS). His Google Scholar citation count is 2838, with an h-index of 29 and an i10-index of 85.

Sonali Agarwal
Dr. Sonali Agarwal is an Associate Professor in the Information Technology Department of the Indian Institute of Information Technology (IIIT) Allahabad, India. She received her Ph.D. from IIIT Allahabad and joined its faculty, where she has been teaching since October 2009. She holds a Bachelor of Engineering (B.E.) degree in Electrical Engineering from Bhilai Institute of Technology, Bhilai (C.G.), India, and a Master of Engineering (M.E.) degree in Computer Science from Motilal Nehru National Institute of Technology (MNNIT), Allahabad, India. Her main research interests are in the areas of Artificial Intelligence and Big Data. She is the head of the Big Data Analytics Lab at IIIT Allahabad, India.

Sanjay Kumar Sonbhadra
Dr. Sanjay Kumar Sonbhadra is presently an Assistant Professor in the Computer Science and Engineering Department of ITER, Shiksha ‘O’ Anusandhan, Bhubaneswar, Odisha, India. He works mainly on one-class classification, anomaly detection, target class guided dimensionality reduction and training sample selection techniques, and big data analytics. From 2017 to 2021, he was a senior member of the “Big Data Analytics Lab” at IIIT Allahabad, India. He has hands-on experience with real-time dashboard solutions for stream data analytics using Apache Flink, Elasticsearch and Kibana. He has published many papers in the area of dimensionality reduction and anomaly detection using one-class classification approaches.

Narinder Singh Punn
Dr. Narinder Singh Punn is a Teaching Research Fellow (TRF) in the Information Technology Department of the Indian Institute of Information Technology (IIIT) Allahabad, India. He received his Ph.D. from IIIT Allahabad in 2022. His main research interests include medical image segmentation, deep learning, and artificial intelligence techniques in healthcare. He is a senior member of the Big Data Analytics Lab at IIIT Allahabad. His recent publications cover applications of deep learning in the detection and prevention of COVID-19, and also explore the potential of self-supervised learning in healthcare.


Tutorial description:
An anomaly is an abnormal pattern in the data, and anomaly detection is an important task in many real-time applications. The anomaly or outlier detection task becomes more challenging when only the target class (class of interest) samples are available during training and samples of other classes are either ill-defined or absent. Conceptually, anomaly or outlier detection can be treated as a one-class classification (OCC) problem. OCC differs from conventional binary or multi-class classification because only samples of the target class, also known as the class of interest (CoI), are available for training, whereas samples of other classes are totally absent or ill-defined. The process of learning the intrinsic properties of the target class is termed target-specific mining.
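As a rough illustration of the OCC setting, the sketch below fits a boundary using target class samples only and rejects anything that falls outside it. It is a deliberately simplified hypersphere model (centroid plus a distance quantile), not SVDD or any specific method covered in the tutorial. The data, the `quantile` parameter, and the function names are assumptions made for this example.

```python
import math
import random


def fit_hypersphere(target_samples, quantile=0.95):
    """Fit a centre and radius using target-class samples only."""
    n = len(target_samples)
    dim = len(target_samples[0])
    centre = [sum(s[d] for s in target_samples) / n for d in range(dim)]
    distances = sorted(math.dist(s, centre) for s in target_samples)
    # Use a distance quantile as the radius so that a few extreme
    # training points do not inflate the boundary.
    index = min(n - 1, int(quantile * n))
    return centre, distances[index]


def is_anomaly(sample, centre, radius):
    """A sample outside the hypersphere is rejected as non-target."""
    return math.dist(sample, centre) > radius


rng = random.Random(0)
# Hypothetical target class: 2-D points clustered around (0, 0).
train = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
centre, radius = fit_hypersphere(train)

near = is_anomaly((0.1, -0.2), centre, radius)  # close to the target cluster
far = is_anomaly((8.0, 8.0), centre, radius)    # far from the target cluster
print(near, far)
```

No outlier samples are used at any point during fitting; the decision rule is derived from the target class alone, which is the defining property of the OCC setting.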

Owing to computational and technological advancements, massive high-dimensional data is now being generated by geographically distributed sources. For target-specific mining or one-class classification on such data, the most promising and discriminating training samples and features must be identified to satisfy two objectives: a) to maximize the learning ability and b) to minimize false predictions. As with other machine learning approaches, the quality and quantity of the training samples and features play a vital role in the performance of OCC algorithms. In this context, training sample reduction/selection (SR) and dimensionality reduction (DR) are two widely applied data reduction techniques in machine learning and pattern recognition, used here to ensure strong separation of class of interest samples from the non-target class. High-dimensional, high-volume data in a distributed environment makes the SR and DR tasks very challenging, and the temporal nature of the data makes them more difficult still. Although several SR and DR techniques have been proposed for conventional binary/multi-class classifiers, very little research on target class learning has been reported to date.
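To make the two reduction steps concrete, the sketch below applies a toy version of each, using target class samples only. Both rules are illustrative assumptions rather than methods from the tutorial: feature selection keeps the features with the smallest target-class variance, and sample reduction keeps the samples farthest from the centroid, on the reasoning that boundary-based describers such as SVDD are determined by boundary samples. The data and function names are hypothetical.

```python
import math


def centroid(samples):
    n = len(samples)
    return [sum(s[d] for s in samples) / n for d in range(len(samples[0]))]


def select_features(samples, keep):
    """Indices of the `keep` features with the smallest target-class variance."""
    centre = centroid(samples)
    n = len(samples)
    variances = [
        sum((s[d] - centre[d]) ** 2 for s in samples) / n
        for d in range(len(centre))
    ]
    ranked = sorted(range(len(centre)), key=lambda d: variances[d])
    return sorted(ranked[:keep])


def reduce_samples(samples, keep):
    """Keep the `keep` samples farthest from the centroid (boundary candidates)."""
    centre = centroid(samples)
    ranked = sorted(samples, key=lambda s: math.dist(s, centre), reverse=True)
    return ranked[:keep]


# Hypothetical target-class samples: features 0 and 1 are compact,
# feature 2 is noisy and says little about the target class.
target = [
    (1.0, 2.0, 10.0),
    (1.1, 2.1, -7.0),
    (0.9, 1.9, 3.0),
    (1.2, 2.0, 25.0),
    (1.0, 2.2, -14.0),
    (0.8, 1.8, 0.0),
]

features = select_features(target, keep=2)                    # DR step
projected = [tuple(s[d] for d in features) for s in target]
boundary = reduce_samples(projected, keep=3)                  # SR step
print(features)
print(boundary)
```

Neither step consults any non-target sample, so the reduction is supervised by the target class alone. Real SR and DR methods for OCC use more careful criteria than these two heuristics.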

Anomaly or outlier detection is needed in numerous applications, such as document classification, disease diagnosis, fraud and intrusion detection, and novelty detection. Although several one-class support vector classifier (OCSVC) algorithms have been proposed for anomaly/novelty detection on batch data, they have not been well explored for distributed or online environments. Real-world applications such as earth science, weather forecasting, satellite and aviation control, and social networking continuously generate samples and features over time. Extracting information from such complex streaming data is challenging because the data sources are geographically distributed and heterogeneous. For distributed streaming data, centralized processing may lead to communication and computation overhead; therefore, incremental learning algorithms are needed. Following the above discussion, the present tutorial aims to introduce young researchers and other participants to the need for target class learning.
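The need for incremental learning can be sketched with a small running model that is updated one sample at a time, so past samples never have to be stored or sent to a central node. The example uses Welford's online mean and variance update together with a z-score rule. This is a generic illustration, not an OCSVC or a method from the tutorial, and the stream values and `threshold` are assumptions.

```python
import math


class RunningTargetModel:
    """Per-feature running mean and variance, updated one sample at a time."""

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations

    def update(self, sample):
        # Welford's update: constant memory, no pass over earlier samples.
        self.count += 1
        for d, value in enumerate(sample):
            delta = value - self.mean[d]
            self.mean[d] += delta / self.count
            self.m2[d] += delta * (value - self.mean[d])

    def score(self, sample):
        """Largest absolute z-score across features (higher = more anomalous)."""
        if self.count == 0:
            return 0.0
        worst = 0.0
        for d, value in enumerate(sample):
            std = math.sqrt(self.m2[d] / self.count)
            if std > 0:
                worst = max(worst, abs(value - self.mean[d]) / std)
        return worst


model = RunningTargetModel(dim=2)
# Hypothetical stream of target-class readings arriving one at a time.
stream = [(20.0, 1.0), (21.0, 1.2), (19.5, 0.9), (20.5, 1.1), (20.2, 1.0)]
for reading in stream:
    model.update(reading)

threshold = 3.0  # assumed cut-off; would need tuning on real data
typical = model.score((20.3, 1.05)) > threshold
unusual = model.score((35.0, 1.0)) > threshold
print(typical, unusual)
```

Because the model state is only a count, a mean, and a sum of squared deviations per feature, each distributed source could maintain its own copy locally and exchange these small summaries instead of raw data.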

Tutorial outline:
The outline of the tutorial is described below:

Introduction

One-class classification models [Target class learning]

Target class guided data reduction for target class mining

Expected target audience:
The tutorial is aimed at researchers in machine learning and pattern recognition. Between 40 and 60 attendees are expected.
