A Multi-Attribute CNN Framework for Deepfake Forensics
Abstract
Deepfake detection is severely hampered by a "Generalization Crisis": detectors fail when they encounter unseen generation techniques or data compression levels because they over-rely on easily suppressed, domain-specific artifacts. To address this, we propose the Multi-Attribute Convolutional Neural Network (MA-CNN) framework, designed specifically for advanced deepfake forensics. The MA-CNN uses a modular, three-stream architecture that performs simultaneous, decoupled extraction of three orthogonal forensic attributes: Frequency Residues (A_F), Low-Level Steganalysis Traces (A_S), and Physiological Inconsistencies (A_P). The core innovation is the Dynamic Fusion Module (DFM), which employs Factorized Bilinear Pooling (FBP) and channel-wise attention to capture complex second-order interdependencies between artifacts, maximizing discriminative power. Rigorous evaluation under a demanding cross-dataset generalization protocol (trained on FaceForensics++, tested on Celeb-DF v2 and DFDC) shows that the MA-CNN outperforms state-of-the-art baselines. This study validates the hypothesis that explicitly engineering multi-attribute feature extractors yields robust, domain-invariant detection, a finding confirmed by interpretability analysis showing that the model prioritizes low-level forensic evidence.
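The fusion step described above can be illustrated with a minimal numerical sketch. This is not the paper's implementation: the feature dimensions, rank, pairwise fusion scheme, and the squeeze-excitation-style gating used for channel attention are all illustrative assumptions; only the general FBP-plus-attention pattern follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def factorized_bilinear_pool(x, y, U, V, P):
    """Low-rank bilinear interaction between two attribute vectors:
    project both inputs to a shared rank space, take the elementwise
    product, project to the output dimension, then apply the usual
    signed square-root and l2 normalization."""
    h = (U.T @ x) * (V.T @ y)                # (rank,)
    z = P.T @ h                              # (out_dim,)
    z = np.sign(z) * np.sqrt(np.abs(z))      # power normalization
    return z / (np.linalg.norm(z) + 1e-8)    # l2 normalization

def channel_attention(z, W1, W2):
    """Squeeze-excitation-style gating: a small bottleneck MLP
    produces sigmoid weights that rescale each fused channel."""
    s = np.tanh(W1 @ z)
    a = 1.0 / (1.0 + np.exp(-(W2 @ s)))      # sigmoid attention weights
    return a * z

d, rank, out = 64, 32, 16
x_f = rng.standard_normal(d)  # frequency-residue features (A_F), assumed shape
x_s = rng.standard_normal(d)  # steganalysis features (A_S), assumed shape
x_p = rng.standard_normal(d)  # physiological features (A_P), assumed shape

# One FBP per stream pair (an assumed design), summed into one fused vector.
params = [(rng.standard_normal((d, rank)) * 0.1,
           rng.standard_normal((d, rank)) * 0.1,
           rng.standard_normal((rank, out)) * 0.1) for _ in range(3)]
pairs = [(x_f, x_s), (x_f, x_p), (x_s, x_p)]
z = sum(factorized_bilinear_pool(a, b, U, V, P)
        for (a, b), (U, V, P) in zip(pairs, params))

W1 = rng.standard_normal((out // 2, out)) * 0.1
W2 = rng.standard_normal((out, out // 2)) * 0.1
fused = channel_attention(z, W1, W2)
print(fused.shape)  # (16,)
```

The pairwise second-order products are what let the fusion capture interdependencies between artifact types, rather than simply concatenating the three streams.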