Hyperparameter Tuning over an Attention Model for Image Captioning

Roberto Castro, Israel Pineda, Manuel Eugenio Morocho-Cayamcela

Research output: Chapter in book/report/conference proceeding; Conference contribution; peer-reviewed

2 Citations (Scopus)

Abstract

Considering the historical trajectory and evolution of image captioning as a research area, this paper focuses on visual attention as an approach to solving captioning tasks with computer vision. It studies the effect of different hyperparameter configurations on a state-of-the-art visual attention architecture composed of a pre-trained residual neural network encoder and a long short-term memory decoder. Results show that the choice of both the cost function and the gradient-based optimizer has a significant impact on the captioning results. The system considers the cross-entropy, Kullback-Leibler divergence, mean squared error, and negative log-likelihood loss functions, together with the adaptive moment estimation (Adam), AdamW, RMSprop, stochastic gradient descent, and Adadelta optimizers. Based on the performance metrics, the combination of cross-entropy with Adam is identified as the best alternative, returning a Top-5 accuracy of 73.092 and a BLEU-4 score of 0.201. With cross-entropy held fixed as the loss function, the first two optimizers (Adam and AdamW) achieve the best performance, each with a BLEU-4 score of 0.201. In terms of the inference loss, Adam outperforms AdamW with 3.413 versus 3.418, and with a Top-5 accuracy of 73.092 versus 72.989.
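The search space described in the abstract can be sketched as a small grid. The snippet below is only an illustration under stated assumptions: the string labels, the `rank_key` ordering (BLEU-4 first, then Top-5 accuracy, then loss), and the helper names are hypothetical and do not come from the authors' code. Only the two results quoted in the abstract are included; the remaining grid cells are left unknown.

```python
from itertools import product

# Loss functions and optimizers named in the abstract.
# The string labels are illustrative, not identifiers from the paper's code.
LOSSES = ["cross_entropy", "kl_divergence", "mse", "nll"]
OPTIMIZERS = ["adam", "adamw", "rmsprop", "sgd", "adadelta"]

# Full grid: every loss function paired with every optimizer.
GRID = list(product(LOSSES, OPTIMIZERS))

# Only the two results reported in the abstract; other cells are not filled in.
REPORTED = {
    ("cross_entropy", "adam"): {"loss": 3.413, "top5": 73.092, "bleu4": 0.201},
    ("cross_entropy", "adamw"): {"loss": 3.418, "top5": 72.989, "bleu4": 0.201},
}


def rank_key(metrics):
    # Assumed ordering: higher BLEU-4, then higher Top-5 accuracy, then lower loss.
    return (-metrics["bleu4"], -metrics["top5"], metrics["loss"])


def best_config(results):
    # Return the (loss, optimizer) pair whose metrics rank first.
    return min(results, key=lambda cfg: rank_key(results[cfg]))


if __name__ == "__main__":
    print(len(GRID), "configurations in the grid")
    print("best reported configuration:", best_config(REPORTED))
```

Because both reported configurations tie on BLEU-4, the assumed tie-break on Top-5 accuracy is what separates Adam from AdamW here, which matches the comparison made in the abstract.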

Original language: English
Title of host publication: Information and Communication Technologies - 9th Conference of Ecuador, TICEC 2021, Proceedings
Editors: Juan Pablo Salgado Guerrero, Janneth Chicaiza Espinosa, Mariela Cerrada Lozada, Santiago Berrezueta-Guzman
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 172-183
Number of pages: 12
ISBN (print): 9783030899400
Status: Published - 2021
Published externally
Event: 9th Conference on Information and Communication Technologies of Ecuador, TICEC 2021 - Virtual, Online
Duration: 24 Nov 2021 to 26 Nov 2021

Publication series

Name: Communications in Computer and Information Science
Volume: 1456 CCIS
ISSN (print): 1865-0929
ISSN (electronic): 1865-0937

Conference

Conference: 9th Conference on Information and Communication Technologies of Ecuador, TICEC 2021
City: Virtual, Online
Period: 24/11/21 to 26/11/21
