Asphalt Pavement Distress Detection by Transfer Learning with Multi-head Attention Technique
DOI:
https://doi.org/10.32985/ijeces.16.2.7
Keywords:
computer vision, transfer learning, asphalt distress detection, multi-head attention
Abstract
Roads and highways are a crucial lifeline between communities in every country, and they must be kept in good condition for safe and effective transportation. Traditional road inspection by human inspectors is time-consuming, and the results may be subjective. For this reason, researchers are motivated to automate pavement distress detection to support road monitoring and maintenance. Many researchers have presented models to detect distress on road infrastructure; however, these models face accuracy challenges and overfitting because of the nature and complexity of distress images. This paper proposes a model that combines a pre-trained VGG16 with a multi-head attention layer. The proposed pipeline began with smoothing as a pre-processing step to suppress the granular texture of the asphalt gravel and make asphalt damage more distinct. Data augmentation was then conducted to improve model generalization by adding varied distress scenes to the dataset through geometric, color, and intensity transformations. This work also contributes a locally collected dataset that contains three types of asphalt distress (cracks, potholes, and ruts). The proposed model was tested on three benchmark datasets in addition to the locally collected one, and it detected asphalt distress effectively in both offline and real-time images. The model achieved an accuracy of 1.00 on the Pavementscapes dataset, outperforming the U-Net model and a fully connected network that were trialed on the same dataset. On the DeepCrack dataset, our model scored an accuracy of 1.00, whereas ResNet achieved an accuracy of 0.72 on the same dataset. The model was also tested on the NHA12D dataset, where it achieved an accuracy of 1.00, while VGG16 without an attention layer scored only 0.64.
These tests show that the proposed combination of VGG16 and multi-head attention outperforms the earlier models. The proposed model was also tested in real time on local roads. Future work will aim to make the self-attention mechanism more explainable and to implement a multi-scale attention layer.
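
To make the multi-head attention idea in the abstract concrete, the following is a minimal pure-Python sketch of multi-head self-attention applied to a set of feature vectors, such as the spatial positions of a convolutional feature map. It is not the authors' implementation: the function names, the random projection matrices (standing in for learned weights), and the toy sizes are all assumptions made for illustration.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    # Assumption: random weights stand in for projections learned in training.
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def multi_head_attention(tokens, num_heads, rng):
    """Scaled dot-product self-attention with several heads.

    tokens: list of n feature vectors, each of length d_model.
    Returns (output, weights): output has shape n x d_model, and
    weights[h] is the n x n attention matrix of head h.
    """
    d_model = len(tokens[0])
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    head_outputs, all_weights = [], []
    for _ in range(num_heads):
        wq, wk, wv = (random_matrix(d_model, d_head, rng) for _ in range(3))
        q, k, v = matmul(tokens, wq), matmul(tokens, wk), matmul(tokens, wv)
        scores = matmul(q, transpose(k))
        # Scale by sqrt(d_head), then normalize each row into attention weights.
        weights = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        head_outputs.append(matmul(weights, v))
        all_weights.append(weights)

    # Concatenate the heads per token, then mix them with an output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(tokens))]
    wo = random_matrix(d_model, d_model, rng)
    return matmul(concat, wo), all_weights


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for a flattened feature map: 6 positions, 8 channels each.
    feature_tokens = [[rng.random() for _ in range(8)] for _ in range(6)]
    output, weights = multi_head_attention(feature_tokens, num_heads=2, rng=rng)
    print(len(output), len(output[0]), len(weights))
```

Each head attends over all positions with its own projections, so different heads can weight different regions of the image; in a trained model the projection matrices would be learned rather than sampled.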
License
Copyright (c) 2025 International Journal of Electrical and Computer Engineering Systems

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.