Published

2025-07-01

Adversarial Image Detection Based on Bayesian Neural Layers

DOI:

https://doi.org/10.15446/rce.v48n2.116416

Keywords:

Adversarial Attacks, Bayesian Neural Networks

Authors

  • Andrés Mora Valencia, Universidad de los Andes
  • Héctor J. Hortúa, Universidad El Bosque
  • Diego Nicolás Ávila Moreno, Universidad El Bosque

Although Deep Neural Networks (DNNs) have repeatedly shown excellent performance, they are known to be vulnerable to adversarial attacks built from human-imperceptible perturbations. Multiple adversarial defense strategies have been proposed to alleviate this issue, but their practicality is often limited: they can be computationally inefficient or effective only against specific attacks. In this paper, we analyze the performance of Bayesian Neural Networks (BNNs) endowed with flexible approximate posterior distributions for detecting adversarial examples. Furthermore, we study how robust the detection method is when Bayesian layers are placed only at the top of the DNN or throughout it, in order to determine the role of the network's hidden layers, and we compare the results with those of deterministic networks. We show that BNNs offer a powerful and practical method for detecting adversarial examples compared with deterministic approaches. Finally, we discuss the impact of using well-calibrated models as detectors and how non-Gaussian priors enhance detection performance.
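
As a rough illustration of this setup, the sketch below (not the authors' implementation) places a single variational layer on top of a deterministic convolutional trunk and scores inputs by the mutual information estimated from Monte Carlo forward passes. The architecture, the use of TensorFlow Probability's DenseFlipout layer (a mean-field posterior rather than the flexible posteriors studied in the paper), and the percentile-based threshold are all illustrative assumptions.

```python
# Minimal sketch of uncertainty-based adversarial detection with a Bayesian last
# layer, assuming TensorFlow Probability is available. A deterministic CNN trunk is
# topped with a Flipout layer; inputs are flagged when the mutual information between
# predictions and weights (estimated by Monte Carlo forward passes) exceeds a
# threshold chosen on clean validation data.
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp


def build_bayesian_top_model(num_classes=10):
    """Deterministic CNN trunk with a Bayesian last layer (the 'top' configuration)."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(28, 28, 1)),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        # Variational output layer: each forward pass resamples the weight perturbations.
        tfp.layers.DenseFlipout(num_classes, activation="softmax"),
    ])


def mutual_information(model, x, n_samples=20):
    """Epistemic-uncertainty score per input: MI between the prediction and the weights."""
    probs = np.stack([model(x, training=True).numpy() for _ in range(n_samples)])  # (S, N, C)
    mean_probs = probs.mean(axis=0)                                                # (N, C)
    entropy_of_mean = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=-1)
    mean_of_entropies = np.mean(-np.sum(probs * np.log(probs + 1e-12), axis=-1), axis=0)
    return entropy_of_mean - mean_of_entropies


# Usage sketch (training details omitted): fit the model on clean data, pick the
# detection threshold as a high percentile of the score on clean validation images,
# then flag test inputs whose score exceeds it.
# model = build_bayesian_top_model()
# threshold = np.percentile(mutual_information(model, x_val_clean), 95)
# is_adversarial = mutual_information(model, x_candidate) > threshold
```

Replacing further dense layers of the trunk with variational ones would correspond to the "Bayesian layers throughout the network" configuration compared in the paper.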

How to Cite

APA

Mora Valencia, A., Hortúa, H. J. & Ávila Moreno, D. N. (2025). Adversarial Image Detection Based on Bayesian Neural Layers. Revista Colombiana de Estadística, 48(2), 183–200. https://doi.org/10.15446/rce.v48n2.116416
