Published

2019-10-01

EspiNet V2: a region based deep learning model for detecting motorcycles in urban scenarios

Keywords:

vehicle detection, motorcycle detection, Faster R-CNN, region-based detectors, convolutional neural networks, deep learning

This paper presents EspiNet V2, a deep learning model based on the region-based detector Faster R-CNN. The model detects motorcycles in urban environments, where occlusion is likely. Two datasets are used for training: the Urban Motorbike Dataset (UMD-10K), with 10,000 annotated images, and the new Secretaría de Movilidad Motorbike Dataset (SMMD), with 5,000 images captured from the traffic control CCTV system in Medellín (Colombia). On UMD-10K, the model reaches an average precision (AP) of 88.8%, even though 60% of the motorcycles are occluded and the images were captured from a low angle with a moving camera. On SMMD, it reaches an AP of 79.5%. EspiNet V2 outperforms popular models such as YOLO V3 and Faster R-CNN (VGG16-based) trained end-to-end on the same datasets.
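The abstract reports results as average precision (AP). As a rough illustration of what that figure measures, the sketch below computes AP for a single class on a single image, assuming a PASCAL VOC-style IoU threshold of 0.5 and all-point interpolation of the precision-recall curve. The abstract does not state which AP variant or threshold the authors used, so both choices, along with the function names `iou` and `average_precision`, are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical box format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: Sequence[Tuple[float, Box]],
    ground_truth: Sequence[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not taken from the paper
) -> float:
    """AP for one class on one image; detections are (score, box) pairs."""
    if not ground_truth:
        return 0.0

    # Walk detections from most to least confident and mark each one as a
    # true positive if it matches a not-yet-matched ground-truth box.
    matched = [False] * len(ground_truth)
    tp_flags: List[bool] = []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        is_tp = best_iou >= iou_threshold and not matched[best_idx]
        if is_tp:
            matched[best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each ranked detection.
    precisions: List[float] = []
    recalls: List[float] = []
    tp = 0
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += int(is_tp)
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))

    # Make precision non-increasing in recall (interpolation envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

A dataset-level AP would pool the detections of all images before ranking them; the single-image form is kept here only to keep the sketch short.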
