Machine Learning Approaches to Advanced Outlier Detection in Psychological Datasets

Authors

  • Khoula Al. Abri, Universiti Tenaga Nasional, Department of College of Computing and Informatics, Kajang, Selangor, Malaysia
  • Manjit Singh Sidhu, Universiti Tenaga Nasional, Department of College of Computing and Informatics, Kajang, Selangor, Malaysia

DOI:

https://doi.org/10.32985/ijeces.15.1.2

Keywords:

outlier detection, psychological dataset, machine learning techniques, ensemble methods

Abstract

This study aims to determine the most effective outlier detection methods for multivariate psychological datasets, particularly those collected from Omani students. Because such datasets are complex, they demand robust analytical methods. To this end, we applied three algorithms: local outlier factor (LOF), one-class support vector machine (OCSVM), and isolation forest (IF). LOF and IF each flagged 155 outliers, and OCSVM flagged 147. A deeper analysis showed that LOF detected 55 unique outliers based on differences in local density, OCSVM isolated 44 unique outliers using its transformed feature space, and IF identified 76 unique outliers through its tree-based partitioning. Despite these differences, the three methods agreed on only 44 outliers. Among the ensemble techniques, the averaging and voting methods each identified 155 outliers and the weighted method identified 151, with all three ensemble methods agreeing on 150 outliers. In conclusion, while individual algorithms offer distinct perspectives, ensemble techniques improve the accuracy and consistency of outlier detection. This underscores the value of combining multiple algorithms through ensemble techniques when analyzing psychological datasets, which supports a richer understanding of the underlying data structure.
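The abstract does not specify how the unique and consensus outliers were counted or how the averaging, voting, and weighted ensembles were built. The following is a minimal sketch of one plausible reading, using only the Python standard library. Everything in it is an assumption made for illustration: the toy scores, the convention that a higher score means more anomalous, min-max normalization, the top-k cutoff, the two-vote majority rule, and the weights. It is not the authors' implementation.

```python
from statistics import mean

def min_max(scores):
    """Rescale scores to [0, 1] so detectors are comparable (assumed step)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def top_k(scores, k):
    """Indices of the k highest scores (assumed: higher = more anomalous)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])

def overlap(flags):
    """Return (points flagged by every detector, points flagged by only one)."""
    consensus = set.intersection(*flags.values())
    unique = {}
    for name, flagged in flags.items():
        others = set().union(*(f for n, f in flags.items() if n != name))
        unique[name] = flagged - others
    return consensus, unique

def average_ensemble(scores, k):
    """Average the normalized scores, then flag the top k points."""
    norm = [min_max(s) for s in scores.values()]
    combined = [mean(column) for column in zip(*norm)]
    return top_k(combined, k)

def weighted_ensemble(scores, weights, k):
    """Weighted average of normalized scores, then flag the top k points."""
    total = sum(weights.values())
    norm = {name: min_max(s) for name, s in scores.items()}
    n_points = len(next(iter(norm.values())))
    combined = [
        sum(weights[name] * norm[name][i] for name in norm) / total
        for i in range(n_points)
    ]
    return top_k(combined, k)

def voting_ensemble(flags, min_votes=2):
    """Flag points that at least min_votes detectors marked as outliers."""
    votes = {}
    for flagged in flags.values():
        for i in flagged:
            votes[i] = votes.get(i, 0) + 1
    return {i for i, v in votes.items() if v >= min_votes}

# Hypothetical anomaly scores for six observations from three detectors.
scores = {
    "lof":     [0.9, 0.8, 0.1, 0.2, 0.3, 0.1],
    "ocsvm":   [0.7, 0.1, 0.9, 0.2, 0.1, 0.3],
    "iforest": [0.8, 0.2, 0.1, 0.9, 0.2, 0.1],
}
flags = {name: top_k(s, 2) for name, s in scores.items()}
consensus, unique = overlap(flags)
voted = voting_ensemble(flags, min_votes=2)
averaged = average_ensemble(scores, k=2)
weighted = weighted_ensemble(scores, {"lof": 2, "ocsvm": 1, "iforest": 1}, k=2)
```

In this toy example each detector flags one point that no other detector flags, and all three agree on a single point, which mirrors on a small scale the pattern of unique versus consensus outliers described above. In practice the per-detector scores would come from fitted LOF, OCSVM, and IF models rather than hand-written lists.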

Published

2024-01-08

Section

Original Scientific Papers