Deep Learning-Based Surface Defect Detection Technology for Microdevices
Surface defect detection on microdevices is difficult because defects are extremely small, complex backgrounds interfere with detection, and industrial use imposes strict requirements on both detection accuracy and computational efficiency. This study aims to develop a lightweight, high-precision detection framework suitable for resource-constrained industrial deployment. To this end, the paper proposes LiteKANformer, a multi-module lightweight Transformer-based architecture. Building on LiteKANformer, a detection framework named LKF-YOLO is constructed by embedding global contextual modeling into the backbone and optimizing multi-scale feature fusion in the neck network. Experiments are conducted on a PCB surface defect dataset and a semiconductor chip defect dataset. Compared with the C3TR module, LiteKANformer achieves comparable detection accuracy while reducing the parameter count by approximately 3.1% and improving the inference frame rate by 2.3%. Furthermore, LKF-YOLO outperforms other mainstream detection models on the PCB dataset in accuracy, recall, and real-time performance. The main novelty of this work lies in the co-design of activation representation, normalization strategy, and computation primitives within a unified lightweight Transformer block, which balances detection precision and deployment efficiency for microdevice surface defect detection.
This work (including HTML and PDF files) is licensed under a Creative Commons Attribution 4.0 International License.





















