Learning action selection in a simple but unpredictable world
DOI: https://doi.org/10.15446/ing.investig.n49.21404
Keywords: Reinforcement learning, Q-learning, Autonomous agents, Animats
One of the main problems studied in the simulation of artificial autonomous agents is action selection: a mechanism that allows the system to choose the most suitable action for the situation in which it finds itself, so as to maximize its measure of success. Reinforcement learning is an attractive approach to this problem, since it is based on seeking reward signals and avoiding punishment signals through a process of trial and error. In this paper we present PAISA I, an artificial creature that learns to behave (that is, to select actions) using a reinforcement learning technique known as Q-learning to optimize the amount of food it can find in an unpredictable world, albeit one with a small state-action space.
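To make the approach concrete, the following is a minimal sketch of tabular Q-learning with epsilon-greedy action selection, in the spirit of the setting the abstract describes (a small state-action space and stochastic food rewards). The state and action counts, the learning parameters, and the step() dynamics are hypothetical illustrations, not the authors' actual PAISA I implementation. The core update is Q(s,a) <- Q(s,a) + alpha * [r + gamma * max_a' Q(s',a') - Q(s,a)].

```python
import random

# Hypothetical problem sizes and parameters (not from the paper).
N_STATES = 16      # small state space, as in the paper's setting
N_ACTIONS = 4      # e.g. move north / south / east / west
ALPHA = 0.1        # learning rate
GAMMA = 0.9        # discount factor
EPSILON = 0.1      # exploration probability

# Tabular Q-values, initialized to zero.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def step(state, action):
    """Stand-in for an unpredictable world: transitions and food
    rewards are stochastic, so the agent must learn by trial and error."""
    next_state = random.randrange(N_STATES)
    reward = 1.0 if random.random() < 0.2 else 0.0  # food found at random
    return next_state, reward

def select_action(state):
    # Epsilon-greedy action selection: mostly exploit, sometimes explore.
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[state][a])

state = 0
for _ in range(10000):
    action = select_action(state)
    next_state, reward = step(state, action)
    # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
    best_next = max(Q[next_state])
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])
    state = next_state
```

Because the update bootstraps from max_a' Q(s',a') rather than from the action actually taken, Q-learning is off-policy: it converges toward the optimal value function even while the creature keeps exploring, which is what makes it suitable for a world whose dynamics cannot be predicted in advance.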
References
[Baird, 1994] Baird, L. C. (1994). Reinforcement Learning in Continuous Time: Advantage Updating. Proceedings of the International Conference on Neural Networks. DOI: https://doi.org/10.1109/ICNN.1994.374604
[Lin, 1992] Lin, L. (1992). Self-improving reactive agents based on reinforcement learning, planning and teaching. En: Machine Learning, 8. DOI: https://doi.org/10.1007/978-1-4615-3618-5_5
[Moriarty et al., 1996] Moriarty, D. E.; Miikkulainen, R. (1996). Efficient reinforcement learning through symbiotic evolution, En: Machine Learning, 22. DOI: https://doi.org/10.1007/BF00114722
[Munos et al., 1994] Munos, R.; Patinel, J. (1994). Reinforcement learning with dynamic covering of state-action: partitioning Q-learning. En: Cliff, D.; Husbands, P; Meyer, J. A.; Wilson, S. W. (Eds), From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior. The MIT Press/Bradford Books.
[Peng, 1993] Peng, J. (1993). Efficient Dynamic Programming-based Learning for Control. Tesis doctoral. College of Computer Science of Northeastern University.
[Peng et al., 1996] Peng, J.; Williams, R. J. (1996). Incremental Multi-step Q-Learning. En: Machine Learning, 22. DOI: https://doi.org/10.1007/BF00114731
[Rojas, 1998] Rojas, S. A. (1998). Disertación teórica sobre simulaciones inspiradas biológicamente para el estudio del comportamiento adaptativo. Monografía de grado. Facultad de Ingeniería de la Universidad Nacional de Colombia.
[Sutton et al., 1998] Sutton, R. S.; Barto, A. G. (1998). Reinforcement Learning: An Introduction. The MIT Press. DOI: https://doi.org/10.1109/TNN.1998.712192
[Watkins et al., 1992] Watkins, C. J.; Dayan, P (1992). Q-Learning. En: Machine Learning, 8. DOI: https://doi.org/10.1023/A:1022676722315
License
Copyright (c) 2002 Sergio A. Rojas, José J. Martínez
This work is licensed under a Creative Commons Attribution 4.0 International License.