Software Defect Prediction using Deep Learning by Correlation Clustering of Testing Metrics

Kamal Kant Sharma; Amit Sinha; Arun Sharma

doi:10.32985/ijeces.13.10.15

Authors

Kamal Kant Sharma, Department of Information Technology, KIET Group of Institutions, Delhi-NCR, Ghaziabad; Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
Amit Sinha, Department of Information Technology, ABES Engineering College, Ghaziabad; Dr. A.P.J. Abdul Kalam Technical University, Lucknow, India
Arun Sharma, Department of AI and Data Sciences, Indira Gandhi Delhi Technical University for Women, Delhi, India

DOI:

https://doi.org/10.32985/ijeces.13.10.15

Keywords:

Software Engineering, Software Testing, Abstract Syntax Tree, Machine Learning, Convolution Neural Network

Abstract

The software industry has made significant efforts in recent years to enhance software quality in businesses. Proactive defect prediction helps programmers and white-box testers detect issues early, saving time and money. Conventional software defect prediction methods rely on traditional source code metrics such as code complexity and lines of code. Unfortunately, these metrics cannot capture the semantics of source code. In this paper, we present a novel Correlation Clustering fine-tuned CNN (CCFT-CNN) model based on testing metrics. CCFT-CNN predicts the regions of source code that contain faults, errors, and bugs. Abstract Syntax Tree (AST) tokens are extracted from the source code as testing-metric vectors. Correlations among the AST testing metrics are computed, and the metrics are clustered into a more relevant feature vector that is fed into a Convolutional Neural Network (CNN). Then, to enhance the accuracy of defect prediction, the CNN model is fine-tuned by adjusting its hyperparameters. The evaluation is performed on the PROMISE dataset, which contains samples of open-source Java applications: Camel, jEdit, POI, Synapse, Xerces, and Xalan. The results show that the CCFT-CNN model increases the average F-measure by 2% compared to the baseline model.
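
The correlation-clustering step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which correlation measure or clustering algorithm the authors use, so the Pearson coefficient, the greedy threshold grouping, the 0.8 threshold, and all metric names and values below are assumptions chosen only for illustration, not the paper's method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_clusters(columns, threshold=0.8):
    """Greedily group metric columns by correlation with each cluster's first member.

    columns maps a metric name to its per-file values. A column joins the first
    cluster whose seed column it correlates with at |r| >= threshold; otherwise
    it starts a new cluster. (Assumed grouping rule, not taken from the paper.)
    """
    clusters = []
    for name, values in columns.items():
        for cluster in clusters:
            seed = cluster[0]
            if abs(pearson(values, columns[seed])) >= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Hypothetical AST token counts for five source files.
metrics = {
    "if_stmt": [1, 2, 3, 4, 5],
    "for_stmt": [2, 4, 6, 8, 10],    # moves in lockstep with if_stmt
    "method_call": [5, 1, 4, 2, 3],  # only weakly related to if_stmt
    "try_stmt": [5, 4, 3, 2, 1],     # mirrors if_stmt in reverse
}

clusters = correlation_clusters(metrics)
# Keep one representative metric per cluster as the reduced feature vector.
representatives = [cluster[0] for cluster in clusters]
print(clusters)
print(representatives)
```

In this sketch the reduced feature vector keeps one metric per cluster; in a pipeline like the one the abstract outlines, such a reduced vector would then be passed to the CNN.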
