An Improved Fire Detection Algorithm Based on YOLOv8 Integrated with DGIConv, FourBranchAttention and GSIoU

Muxiang Zhang

Abstract


Fire detection is critical to protecting lives and property, so improving its accuracy is essential. This study improves YOLOv8 to achieve higher fire detection accuracy through three modifications. First, a newly designed DGIConv module replaces the original Conv module, reducing computational complexity while improving model performance. Second, to strengthen the recognition of flame targets, a new attention mechanism named FourBranchAttention is introduced and compared with other attention mechanisms; in these experiments it performed best on both the mAP50 and mAP50-95 metrics. Third, to improve the convergence speed and localization ability of the model, the loss function is optimized by tuning the hyperparameters of the TaskAlignedAssigner and by replacing the original CIoU with the newly designed GSIoU. Ablation experiments showed that each of the three modifications improved detection performance to some extent, and that the model combining all three performed best. Compared with the baseline, the YOLOv8 model with DGIConv, FourBranchAttention, and the optimized loss function increased the mAP50 by 2.52% and the mAP50-95 by 3.37%, reaching 98.46% and 75.26%, respectively. The enhanced YOLOv8 also showed significantly better performance metrics than earlier models such as SSD and YOLOv7, improving the accuracy of fire detection.
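For context, the CIoU measure that the abstract names as the original bounding-box loss term can be sketched as below. This is a minimal illustration of the standard CIoU formulation (IoU minus a center-distance penalty and an aspect-ratio penalty), not the paper's GSIoU, whose definition is not given in this abstract. The function names `iou` and `ciou`, the `(x1, y1, x2, y2)` box format, and the `eps` stabilizer are assumptions made for the example.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def ciou(pred, gt, eps=1e-9):
    """Standard CIoU score; the corresponding loss is 1 - ciou(pred, gt)."""
    overlap = iou(pred, gt)

    # Squared distance between the two box centers.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    gcx, gcy = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    rho2 = (pcx - gcx) ** 2 + (pcy - gcy) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    pw, ph = pred[2] - pred[0], pred[3] - pred[1]
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1 - overlap + v + eps)

    return overlap - rho2 / c2 - alpha * v
```

Unlike plain IoU, this score stays informative for non-overlapping boxes because the center-distance term still varies, which is the kind of localization signal the loss-function modification in the abstract targets.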

 

DOI: 10.28991/HIJ-2024-05-03-09



Keywords


YOLOv8; Conv; Attention; Loss; IoU; mAP50.




Copyright (c) 2024 Muxiang Zhang