Research Paper | Computer Science and Engineering | Volume 15 Issue 4, April 2026 | Pages: 417 - 420 | India
Hybrid Deep Learning-Based Deepfake Video Detection Using Spatial-Temporal Modeling and Attention Mechanisms
Abstract: This study addresses the growing challenge of detecting deepfake videos by proposing a face-centered hybrid deep learning framework for reliable video-level classification. The system integrates a pretrained EfficientNet-B0 model for spatial feature extraction with lightweight 3D convolutional layers for temporal modeling, enabling efficient detection without full 3D CNN complexity. Facial regions are isolated using an OpenCV-based detector, and three attention mechanisms, namely temporal, channel, and spatial attention, enhance feature discrimination. The model is deployed as a FastAPI service for real-world applicability. Experimental evaluation on the DFDC-P dataset demonstrates strong performance, achieving 91.4% accuracy, an AUC-ROC of 0.964, and an F1-score of 0.911. The results confirm that combined spatial-temporal learning improves robustness in detecting subtle manipulation artifacts, supporting practical deployment in forensic and content moderation systems.
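The temporal attention step mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the attention formulation, so this assumes a generic softmax-weighted pooling over per-frame feature vectors; the function name `temporal_attention_pool`, the scoring vector `w`, and the choice of 16 frames are hypothetical. The width 1280 is the pooled feature size of the standard EfficientNet-B0; random values stand in for real features here.

```python
import numpy as np

def temporal_attention_pool(frame_features, w):
    """Collapse per-frame features of shape (T, D) into one video-level vector (D,).

    `w` is a hypothetical learned scoring vector of shape (D,); this is an
    assumed, generic formulation, not the paper's confirmed design.
    """
    scores = frame_features @ w              # (T,) one relevance score per frame
    scores = scores - scores.max()           # shift for a numerically stable softmax
    exp_scores = np.exp(scores)
    weights = exp_scores / exp_scores.sum()  # (T,) non-negative, sums to 1
    video_vector = weights @ frame_features  # (D,) attention-weighted average of frames
    return video_vector, weights

rng = np.random.default_rng(0)
T, D = 16, 1280                              # assumed frame count; EfficientNet-B0 feature width
features = rng.standard_normal((T, D))       # placeholder for per-frame CNN features
w = rng.standard_normal(D) / np.sqrt(D)      # placeholder for a trained scoring vector
video_vector, weights = temporal_attention_pool(features, w)
```

Frames with higher scores contribute more to the video-level vector, which a classifier head could then map to a real/fake probability. With a zero scoring vector the pooling reduces to a plain average over frames.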
Keywords: Deepfake Detection, Deep Learning, EfficientNet-B0, Temporal Modeling, Attention Mechanism, Video Forensics, FastAPI, Computer Vision, Video Classification, Artificial Intelligence Security
How to Cite: Nagaraj Moger, Smruthi Y Rao, Pragathi Shetty, A Madan, "Hybrid Deep Learning-Based Deepfake Video Detection Using Spatial-Temporal Modeling and Attention Mechanisms", Volume 15 Issue 4, April 2026, International Journal of Science and Research (IJSR), Pages: 417-420, https://www.ijsr.net/getabstract.php?paperid=SR26327224028, DOI: https://dx.doi.org/10.21275/SR26327224028