KSII Transactions on Internet and Information Systems
Monthly Online Journal (eISSN: 1976-7277)

HAT-YOLO: Hybrid Attention and Transformer-Integrated Model for UAV Object Detection on Embedded Devices


Abstract

The rapid advancement of unmanned aerial vehicles (UAVs) has increased the demand for accurate, lightweight, real-time object detection models suitable for embedded deployment. However, detecting target objects in remote sensing images is challenging because of scale variation, complex backgrounds, and high object density. In addition, many existing deep learning-based models require large amounts of GPU memory. To overcome these limitations, we propose HAT-YOLO, an improved YOLOv8n-based architecture for efficient vehicle detection in aerial images on memory-limited devices. HAT-YOLO introduces three key modules. First, A2C2, a dual channel-spatial attention module, improves local feature discrimination. Second, TA2C2, a lightweight transformer-based module, improves the extraction of long-range global context features. Finally, CBG, a module that integrates the GELU activation function, provides faster convergence, non-linear feature representation, and lower processing time. HAT-YOLO was trained and evaluated on the VEDAI and RSOD datasets using NVIDIA RTX A2000, Google Colab, and Jetson Nano GPU platforms. Experimental results show that HAT-YOLO improves on YOLOv8n by 6.7% and 4.2% in mAP@0.5 and by 6.1% and 3.2% in mAP@0.5:0.95 on the VEDAI and RSOD datasets, respectively, while maintaining a lightweight architecture with 3.82 million parameters and 8.9 GFLOPs. On Jetson Nano, HAT-YOLO achieves 17.2 FPS on VEDAI and 18.4 FPS on RSOD, indicating real-time detection performance on resource-constrained devices. These results show that the model effectively balances detection accuracy, architectural complexity, and inference speed, making it well suited to deployment on embedded platforms. The proposed model is available at https://github.com/mdminhazulhaq/HAT-YOLO
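For readers unfamiliar with the GELU activation named in the abstract, the following is a minimal standard-library sketch of the function itself, together with the per-frame time implied by the reported Jetson Nano frame rates. It is not taken from the authors' repository and says nothing about how the CBG module is built. The function names `gelu`, `gelu_tanh`, and `frame_time_ms` are hypothetical and chosen only for illustration.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Commonly used tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def frame_time_ms(fps: float) -> float:
    """Average time per frame, in milliseconds, for a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Compare the exact form with the approximation at a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={x:+.1f}  gelu={gelu(x):+.4f}  tanh approx={gelu_tanh(x):+.4f}")

    # Frame rates reported in the abstract for Jetson Nano.
    for dataset, fps in (("VEDAI", 17.2), ("RSOD", 18.4)):
        print(f"{dataset}: {fps} FPS is about {frame_time_ms(fps):.1f} ms per frame")
```

Unlike ReLU, GELU is smooth and passes small negative values through, which is one reason it is often paired with transformer-style blocks. The frame-time conversion simply restates the reported FPS figures as a per-frame budget.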


Cite this article

[IEEE Style]
M. M. Haq, A. S. M. Khairuddin, E. Hanafi, H. M. F. Noman, M. H. Junos, "HAT-YOLO: Hybrid Attention and Transformer-Integrated Model for UAV Object Detection on Embedded Devices," KSII Transactions on Internet and Information Systems, vol. 20, no. 2, pp. 920-941, 2026. DOI: 10.3837/tiis.2026.02.014.

[ACM Style]
Md. Minhazul Haq, Anis Salwa Mohd Khairuddin, Effariza Hanafi, Hafiz Muhammad Fahad Noman, and Mohamad Haniff Junos. 2026. HAT-YOLO: Hybrid Attention and Transformer-Integrated Model for UAV Object Detection on Embedded Devices. KSII Transactions on Internet and Information Systems, 20, 2, (2026), 920-941. DOI: 10.3837/tiis.2026.02.014.

[BibTeX Style]
@article{tiis:105901,
  title="HAT-YOLO: Hybrid Attention and Transformer-Integrated Model for UAV Object Detection on Embedded Devices",
  author="Md. Minhazul Haq and Anis Salwa Mohd Khairuddin and Effariza Hanafi and Hafiz Muhammad Fahad Noman and Mohamad Haniff Junos",
  journal="KSII Transactions on Internet and Information Systems",
  DOI={10.3837/tiis.2026.02.014},
  volume={20},
  number={2},
  year="2026",
  month={February},
  pages={920-941}
}