Published

2014-05-01

UV-vis in situ spectrometry data mining through linear and non linear analysis methods

Minería de datos UV-vis in situ con métodos de análisis lineales y no lineales

DOI:

https://doi.org/10.15446/dyna.v81n185.37718

Keywords:

UV-visible spectrometer, water quality, multivariate data analysis, non-linear data analysis (en)
Espectrómetro UV-visible, calidad de agua, análisis multivariado, análisis de datos no lineal (es)

Authors

  • Liliana Lopez-Kleine, Universidad Nacional de Colombia
  • Andrés Torres, Pontificia Universidad Javeriana
UV-visible spectrometers are instruments that record, at several wavelengths, the absorbance of emitted light by particles suspended in water. They deliver continuous measurements that can be interpreted as concentrations of parameters commonly used to evaluate the physico-chemical status of water bodies. Classical parameters that indicate the presence of pollutants are total suspended solids (TSS) and chemical oxygen demand (COD). Flexible and efficient methods to relate the instrument's multivariate readings to classical measurements are needed to extract useful information for management and monitoring. Analysis methods such as Partial Least Squares (PLS) are used to calibrate an instrument for a given water matrix while taking cross-sensitivity into account. Several authors have shown that it is necessary to calibrate the instrument specifically for the hydro-system under study and to explore linear and non-linear statistical methods for analysing UV-visible data and their relationship with chemical and physical parameters. In this work we apply classical linear multivariate data analysis and non-linear kernel methods to mine high-dimensional UV-vis data. These methods prove useful for detecting relationships between UV-vis data and classical parameters, for detecting outliers, and for revealing non-linear data structures.
Los espectrómetros UV-visibles son captores que registran la absorbancia de luz emitida por partículas suspendidas en el agua a diferentes longitudes de onda y proporcionan mediciones en continuo, las cuales pueden ser interpretadas como concentraciones de parámetros comúnmente usados para evaluar el estado físico-químico de cuerpos de agua. Parámetros clásicos usados para detectar la presencia de contaminación en el agua son los sólidos suspendidos totales (TSS) y la demanda química de oxígeno (CDO). Métodos de análisis flexibles y eficientes son necesarios para extraer información útil para fines de gestión y monitoreo a partir de los datos multivariados que proporcionan los captores. Se han usado métodos de calibración de tipo regresión parcial por mínimos cuadrados parciales (PLS). Varios autores han demostrado la necesidad de realizar la calibración para cada tipo de datos y cada cuerpo de agua, así como explorar métodos de análisis lineales y no lineales para el análisis de datos UV-visible y para determinar su relación con parámetros clásicos. En este trabajo se aplican métodos de análisis multivariado lineales y no lineales para la minería de datos UV-vis de alta dimensión, los cuales resultan útiles para la identificación de relaciones entre parámetros y longitudes de onda, la detección de muestras atípicas, así como la detección de estructuras no lineales en los datos.
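
The PLS calibration mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a plain single-response PLS (PLS1, NIPALS-style) written in pure Python, and every name and number in it (the function names `pls1_fit` and `pls1_predict`, the two absorbance profiles, the TSS and COD concentrations) is a hypothetical, noise-free value chosen only to show how overlapping absorbance from two constituents (cross-sensitivity) can be separated.

```python
import math


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns means and per-component (w, p, q)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred spectra
    f = [v - y_mean for v in y]                                # centred response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in wavelength space most covariant with y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = math.sqrt(dot(w, w))
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # sample scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]  # loadings
        q = dot(f, t) / tt                                     # response loading
        # Deflate: remove what this component already explains.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one spectrum by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [xi - mi for xi, mi in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ei - t * pi for ei, pi in zip(e, p)]
    return y_hat


# Hypothetical absorbance profiles at five wavelengths (assumed values).
profile_tss = [0.9, 0.7, 0.5, 0.3, 0.2]
profile_cod = [0.1, 0.4, 0.6, 0.5, 0.1]

# Hypothetical reference concentrations for six calibration samples.
tss = [50.0, 120.0, 80.0, 200.0, 150.0, 30.0]
cod = [300.0, 100.0, 250.0, 400.0, 150.0, 220.0]


def spectrum(tss_value, cod_value):
    """Synthetic spectrum: both constituents absorb at every wavelength."""
    return [tss_value * a + cod_value * b for a, b in zip(profile_tss, profile_cod)]


spectra = [spectrum(a, b) for a, b in zip(tss, cod)]
model = pls1_fit(spectra, tss, n_components=2)

# New synthetic sample with TSS = 90 and COD = 180 (assumed values).
predicted_tss = pls1_predict(model, spectrum(90.0, 180.0))
print(round(predicted_tss, 6))
```

Because the synthetic spectra are noise-free mixtures of exactly two constituents, two PLS components are enough to recover the TSS value of the new sample up to rounding error. Real in situ spectra are noisy and cover many more wavelengths, so the number of components would have to be chosen by validation against laboratory measurements.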
