Machine learning for the textural classification of soils based on the particle size distribution
DOI: https://doi.org/10.15446/agron.colomb.v43n3.119825
Keywords: tropical soils, soil texture, supervised classification, Random Forest
Soil texture, defined as the relative proportion of sand, silt, and clay, is a fundamental property that influences water retention, fertility, and soil classification. Using machine learning algorithms to classify soil textures in Cuban sugarcane fields can contribute to sustainable agricultural management and accurate digital mapping. The aim of this study was to apply machine learning algorithms to soil textural classification based on particle-size distribution. A total of 109 soil samples were collected at different depths within the solum (A + B horizons) from various soil types in sugarcane fields belonging to the “Cristino Naranjo” Agroindustrial Sugar Company, located in the municipality of Cacocum, Holguín Province, Cuba. Particle-size distribution was determined by mechanical analysis using the pipette method, and the samples were classified according to their clay, silt, and sand content into four textural classes: clay, clay loam, silty loam, and loam. Orange Data Mining software version 3.36.2 was used for machine learning. Supervised classification methods were applied, and six models were evaluated: Decision Tree, Logistic Regression, Support Vector Machine, k-Nearest Neighbors, Naive Bayes, and Random Forest. Random Forest proved to be the most accurate, stable, and reliable algorithm for textural classification based on particle-size distribution, establishing reproducible qualitative thresholds.
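The workflow summarized above can be illustrated with a minimal sketch. It is not the authors' procedure: the study used Orange Data Mining, whereas this sketch swaps in scikit-learn, and the function names (`texture_class`, `make_samples`, `compare_models`) are hypothetical. The class boundaries approximate the USDA texture triangle, which is an assumption, since the abstract does not state which textural scheme was applied. The samples are synthetic and only stand in for the 109 field samples.

```python
import random

CLASSES = ("clay", "clay loam", "silty loam", "loam")


def texture_class(sand, silt, clay):
    """Label a sample (percentages summing to 100) with one of four classes.

    Boundaries approximate the USDA texture triangle (an assumption, not
    necessarily the scheme used in the study). Returns None when the
    composition falls outside the four classes considered here.
    """
    if clay >= 40 and sand <= 45 and silt < 40:
        return "clay"
    if 27 <= clay < 40 and 20 < sand <= 45:
        return "clay loam"
    if (silt >= 50 and 12 <= clay < 27) or (50 <= silt < 80 and clay < 12):
        return "silty loam"
    if 7 <= clay < 27 and 28 <= silt < 50 and sand <= 52:
        return "loam"
    return None


def make_samples(n=109, seed=0):
    """Draw n synthetic sand/silt/clay compositions inside the four classes."""
    rng = random.Random(seed)
    X, y = [], []
    while len(X) < n:
        # Two sorted cut points split 100 % into three fractions.
        a, b = sorted((rng.uniform(0, 100), rng.uniform(0, 100)))
        sand, silt, clay = a, b - a, 100 - b
        label = texture_class(sand, silt, clay)
        if label is not None:  # discard compositions outside the four classes
            X.append([sand, silt, clay])
            y.append(label)
    return X, y


def compare_models(X, y, folds=5):
    """Return mean cross-validated accuracy for the six classifier families."""
    # scikit-learn is imported lazily so the helpers above work without it.
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import StratifiedKFold, cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    models = {
        "Decision Tree": DecisionTreeClassifier(random_state=0),
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
        "k-Nearest Neighbors": make_pipeline(
            StandardScaler(), KNeighborsClassifier()
        ),
        "Naive Bayes": GaussianNB(),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    cv = StratifiedKFold(n_splits=folds, shuffle=True, random_state=0)
    # The default scoring for classifiers is accuracy.
    return {
        name: cross_val_score(model, X, y, cv=cv).mean()
        for name, model in models.items()
    }
```

Calling `compare_models(*make_samples())` returns a dictionary mapping each model name to its mean accuracy over five stratified folds. Because the synthetic labels come from fixed thresholds on the three fractions, the scores say nothing about the study's field data; the sketch only shows how such a comparison can be set up.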
License
Copyright (c) 2025 Agronomía Colombiana

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
© Centro Editorial de la Facultad de Ciencias Agrarias, Universidad Nacional de Colombia
Reproduction and quotation of material appearing in the journal is authorized provided the following are explicitly indicated: journal name, author(s) name, year, volume, issue and pages of the source. The ideas and observations recorded by the authors are their own and do not necessarily represent the views and policies of the Universidad Nacional de Colombia. Mention of products or commercial firms in the journal does not constitute a recommendation or endorsement on the part of the Universidad Nacional de Colombia; furthermore, the use of such products should comply with the product label recommendations.
Agronomía Colombiana, by the Centro Editorial of the Facultad de Ciencias Agrarias, Universidad Nacional de Colombia, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (by-nc-sa). Based on a work at http://revistas.unal.edu.co/index.php/agrocol/.