Adversarial Image Detection Based on Bayesian Neural Layers
DOI: https://doi.org/10.15446/rce.v48n2.116416
Keywords: Adversarial Attacks, Bayesian Neural Networks
Although Deep Neural Networks (DNNs) have repeatedly shown excellent performance, they are known to be vulnerable to adversarial attacks that contain human-imperceptible perturbations. Multiple adversarial defense strategies have been proposed to alleviate this issue, but they often show limited practicality: they are computationally inefficient or handle only specific attacks. In this paper, we analyze the performance of Bayesian Neural Networks (BNNs) endowed with flexible approximate posterior distributions for detecting adversarial examples. Furthermore, we study how robust the detection method is when Bayesian layers are placed only at the top of the DNN or throughout it, in order to determine the role of the network's hidden layers, and we compare the results with those of deterministic models. We show that BNNs offer a powerful and practical method for detecting adversarial examples in comparison with deterministic approaches. Finally, we discuss the impact of having well-calibrated models as detectors and how non-Gaussian priors enhance detection performance.
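As a rough illustration of the detection idea summarized above (not the authors' implementation, which uses flexible approximate posteriors rather than the mean-field Gaussian assumed here), the following sketch shows a classifier with a Bayesian layer "at the top": several Monte Carlo forward passes are drawn, and the epistemic part of the predictive uncertainty is thresholded to flag candidate adversarial inputs. All class names, the architecture, and the threshold value are hypothetical.

# Minimal sketch of uncertainty-based adversarial detection with a Bayesian
# last layer. Hypothetical example; the paper's models, priors, and flexible
# posteriors are richer than this mean-field Gaussian version.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BayesianLinear(nn.Module):
    """Mean-field Gaussian posterior over the weights of a linear layer."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -5.0))
        self.b = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Reparameterization trick: sample one set of weights per forward pass.
        w_sigma = F.softplus(self.w_rho)
        w = self.w_mu + w_sigma * torch.randn_like(self.w_mu)
        return F.linear(x, w, self.b)


class BayesianTopClassifier(nn.Module):
    """Deterministic feature extractor with a Bayesian layer at the top."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
        )
        self.head = BayesianLinear(32 * 14 * 14, num_classes)

    def forward(self, x):
        return self.head(self.features(x))


@torch.no_grad()
def epistemic_uncertainty(model, x, n_samples=20):
    """Mutual information between predictions and weights (BALD score).

    High values indicate disagreement among posterior samples, which is the
    signal used here to flag candidate adversarial inputs.
    """
    probs = torch.stack([F.softmax(model(x), dim=-1) for _ in range(n_samples)])
    mean_probs = probs.mean(dim=0)
    predictive_entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(-1)
    expected_entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean(0)
    return predictive_entropy - expected_entropy


if __name__ == "__main__":
    model = BayesianTopClassifier().eval()
    clean = torch.randn(8, 1, 28, 28)  # stand-in for MNIST / Fashion-MNIST inputs
    adversarial = clean + 0.3 * torch.sign(torch.randn_like(clean))  # FGSM-like perturbation
    threshold = 0.1  # hypothetical value, tuned on held-out clean/adversarial data
    scores = epistemic_uncertainty(model, adversarial)
    flagged = scores > threshold  # inputs flagged as adversarial
    print(flagged)

In practice the threshold is chosen on a validation set, and placing Bayesian layers throughout the network (rather than only in the head) changes how much epistemic uncertainty the hidden layers contribute, which is the comparison studied in the paper.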
References
Abdar, M., Pourpanah, F., Hussain, S., Rezazadegan, D., Liu, L., Ghavamzadeh, M., Fieguth, P., Cao, X., Khosravi, A., Rajendra Acharya, U., Makarenkov, V. & Nahavandi, S. (2020), 'A Review of Uncertainty Quantification in Deep Learning: Techniques, Applications and Challenges', arXiv e-prints p. arXiv:2011.06225. DOI: https://doi.org/10.1016/j.inffus.2021.05.008
Andriushchenko, M., Croce, F., Flammarion, N. & Hein, M. (2019), 'Square Attack: a query-efficient black-box adversarial attack via random search', arXiv e-prints p. arXiv:1912.00049. DOI: https://doi.org/10.1007/978-3-030-58592-1_29
Ben Braiek, H., Reid, T. & Khomh, F. (2022), 'Physics-Guided Adversarial Machine Learning for Aircraft Systems Simulation', arXiv e-prints p. arXiv:2209.03431. DOI: https://doi.org/10.1109/TR.2022.3196272
Bortsova, G., González-Gonzalo, C., Wetstein, S. C., Dubost, F., Katramados, I., Hogeweg, L., Liefers, B., van Ginneken, B., Pluim, J. P. W., Veta, M., Sánchez, C. I. & de Bruijne, M. (2020), 'Adversarial Attack Vulnerability of Medical Image Analysis Systems: Unexplored Factors', arXiv e-prints p. arXiv:2006.06356. DOI: https://doi.org/10.1016/j.media.2021.102141
Carlini, N. & Wagner, D. (2016), 'Towards Evaluating the Robustness of Neural Networks', arXiv e-prints p. arXiv:1608.04644. DOI: https://doi.org/10.1109/SP.2017.49
Cohen, G., Sapiro, G. & Giryes, R. (2019), 'Detecting Adversarial Samples Using Influence Functions and Nearest Neighbors', arXiv e-prints p. arXiv:1909.06872. DOI: https://doi.org/10.1109/CVPR42600.2020.01446
Craighero, F., Angaroni, F., Stella, F., Damiani, C., Antoniotti, M. & Graudenzi, A. (2021), 'Unity is strength: Improving the Detection of Adversarial Examples with Ensemble Approaches', arXiv e-prints p. arXiv:2111.12631.
Croce, F. & Hein, M. (2019), 'Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack', arXiv e-prints p. arXiv:1907.02044.
Croce, F. & Hein, M. (2020), 'Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks', arXiv e-prints p. arXiv:2003.01690.
Deng, L. (2012), 'The mnist database of handwritten digit images for machine learning research [best of the web]', IEEE Signal Processing Magazine 29(6), 141-142. DOI: https://doi.org/10.1109/MSP.2012.2211477
Deng, Z., Yang, X., Xu, S., Su, H. & Zhu, J. (2021a), 'LiBRe: A Practical Bayesian Approach to Adversarial Detection', arXiv e-prints p. arXiv:2103.14835.
Deng, Z., Yang, X., Xu, S., Su, H. & Zhu, J. (2021b), Libre: A practical bayesian approach to adversarial detection, in '2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)', IEEE Computer Society, Los Alamitos, CA, USA, pp. 972-982. DOI: https://doi.org/10.1109/CVPR46437.2021.00103
Dinh, L., Sohl-Dickstein, J. & Bengio, S. (2016), 'Density estimation using Real NVP', arXiv e-prints p. arXiv:1605.08803.
Fiedler, F. & Lucia, S. (2023), 'Improved uncertainty quantification for neural networks with Bayesian last layer', arXiv e-prints p. arXiv:2302.10975. DOI: https://doi.org/10.1109/ACCESS.2023.3329685
Fortuin, V., Garriga-Alonso, A., Ober, S. W., Wenzel, F., Rätsch, G., Turner, R. E., van der Wilk, M. & Aitchison, L. (2021), 'Bayesian Neural Network Priors Revisited', arXiv e-prints p. arXiv:2102.06571.
Gal, Y. (2016), Uncertainty in Deep Learning, PhD thesis, University of Cambridge.
García-Farieta, J. E., Hortúa, H. J. & Kitaura, F.-S. (2023), 'Bayesian deep learning for cosmic volumes with modified gravity', arXiv e-prints p. arXiv:2309.00612. DOI: https://doi.org/10.1051/0004-6361/202347929
Goodfellow, I. J., Shlens, J. & Szegedy, C. (2014), 'Explaining and Harnessing Adversarial Examples', arXiv e-prints p. arXiv:1412.6572.
Graves, A. (2011), Practical variational inference for neural networks, in J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira & K. Q. Weinberger, eds, 'Advances in Neural Information Processing Systems', Vol. 24, Curran Associates, Inc., pp. 2348-2356.
Guesmi, A., Khasawneh, K. N., Abu-Ghazaleh, N. & Alouani, I. (2022), 'ROOM: Adversarial Machine Learning Attacks Under Real-Time Constraints', arXiv e-prints p. arXiv:2201.01621. DOI: https://doi.org/10.1109/IJCNN55064.2022.9892437
Henning, C., D'Angelo, F. & Grewe, B. F. (2021), 'Are Bayesian neural networks intrinsically good at out-of-distribution detection?', arXiv e-prints p. arXiv:2107.12248.
Huck Yang, C.-H., Ahmed, Z., Gu, Y., Szurley, J., Ren, R., Liu, L., Stolcke, A. & Bulyko, I. (2022), 'Mitigating Closed-model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition', arXiv e-prints p. arXiv:2202.08532. DOI: https://doi.org/10.1109/ICASSP43922.2022.9746230
Hüllermeier, E. & Waegeman, W. (2019), 'Aleatoric and Epistemic Uncertainty in Machine Learning: An Introduction to Concepts and Methods', arXiv e-prints p. arXiv:1910.09457.
Jimenez Rezende, D. & Mohamed, S. (2015), 'Variational Inference with Normalizing Flows', arXiv e-prints p. arXiv:1505.05770.
Laves, M.-H., Ihler, S., Kortmann, K.-P. & Ortmaier, T. (2019), 'Well-calibrated model uncertainty with temperature scaling for dropout variational inference'.
Lee, K., Lee, K., Lee, H. & Shin, J. (2018), 'A Simple Unified Framework for Detecting Out-of-Distribution Samples and Adversarial Attacks', arXiv e-prints p. arXiv:1807.03888.
Li, Y., Cheng, M., Hsieh, C.-J. & Lee, T. C. M. (2021), 'A Review of Adversarial Attack and Defense for Classification Methods', arXiv e-prints p. arXiv:2111.09961.
Li, Y., Tang, T., Hsieh, C.-J. & Lee, T. C. M. (2021), 'Detecting Adversarial Examples with Bayesian Neural Network', arXiv e-prints p. arXiv:2105.08620.
Louizos, C. & Welling, M. (2017), 'Multiplicative Normalizing Flows for Variational Bayesian Neural Networks', arXiv e-prints p. arXiv:1703.01961.
Lu, J., Issaranon, T. & Forsyth, D. (2017), 'SafetyNet: Detecting and Rejecting Adversarial Examples Robustly', arXiv e-prints p. arXiv:1704.00103. DOI: https://doi.org/10.1109/ICCV.2017.56
Ma, X., Li, B., Wang, Y., Erfani, S. M., Wijewickrema, S., Schoenebeck, G., Song, D., Houle, M. E. & Bailey, J. (2018), 'Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality', arXiv e-prints p. arXiv:1801.02613.
Madry, A., Makelov, A., Schmidt, L., Tsipras, D. & Vladu, A. (2017), 'Towards Deep Learning Models Resistant to Adversarial Attacks', arXiv e-prints p. arXiv:1706.06083.
Moosavi-Dezfooli, S.-M., Fawzi, A. & Frossard, P. (2015), 'DeepFool: a simple and accurate method to fool deep neural networks', arXiv e-prints p. arXiv:1511.04599. DOI: https://doi.org/10.1109/CVPR.2016.282
Mukherjee, S., Sen, B. & Sen, S. (2023), 'A Mean Field Approach to Empirical Bayes Estimation in High-dimensional Linear Regression', arXiv e-prints p. arXiv:2309.16843.
Simonyan, K. & Zisserman, A. (2014), 'Very Deep Convolutional Networks for Large-Scale Image Recognition', arXiv e-prints p. arXiv:1409.1556.
Smith, L. & Gal, Y. (2018), 'Understanding Measures of Uncertainty for Adversarial Example Detection', arXiv e-prints p. arXiv:1803.08533.
Wang, Y., Sun, T., Li, S., Yuan, X., Ni, W., Hossain, E. & Poor, H. V. (2023), 'Adversarial Attacks and Defenses in Machine Learning-Powered Networks: A Contemporary Survey', arXiv e-prints p. arXiv:2303.06302. DOI: https://doi.org/10.1109/COMST.2023.3319492
Watson, J., Andreas Lin, J., Klink, P., Pajarinen, J. & Peters, J. (2021), Latent derivative bayesian last layer networks, in A. Banerjee & K. Fukumizu, eds, 'Proceedings of The 24th International Conference on Artificial Intelligence and Statistics', Vol. 130 of Proceedings of Machine Learning Research, PMLR, pp. 1198-1206. https://proceedings.mlr.press/v130/watson21a.html
Wu, H., Yunas, S., Rowlands, S., Ruan, W. & Wahlstrom, J. (2021), 'Adversarial Driving: Attacking End-to-End Autonomous Driving', arXiv e-prints p. arXiv:2103.09151.
Xiao, H., Rasul, K. & Vollgraf, R. (2017), 'Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms', arXiv e-prints p. arXiv:1708.07747.
Xie, Z., Brophy, J., Noack, A., You, W., Asthana, K., Perkins, C., Reis, S., Singh, S. & Lowd, D. (2022), 'Identifying Adversarial Attacks on Text Classifiers', arXiv e-prints p. arXiv:2201.08555.
Zhang, C., Butepage, J., Kjellstrom, H. & Mandt, S. (2017), 'Advances in Variational Inference', arXiv e-prints p. arXiv:1711.05597.
License

This work is licensed under a Creative Commons Attribution 4.0 International License.
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).