Abstract
Speech disorders in children can significantly hinder their social interaction, academic performance, and overall development. Early detection is crucial for effective intervention, yet current diagnostic methods often overlook the potential of video-based analysis. This study explores the application of Principal Component Analysis (PCA) and Autoencoder (AE) techniques to analyse video data, focussing on facial movements during speech to facilitate the clustering of children with distinctive speech features. Using a database of 60 video recordings, embeddings were generated and evaluated on the basis of their ability to reconstruct the original data, with the results visualised through t-SNE. PCA demonstrated superior performance, with a Mean Squared Error (MSE) of 0.0023 at 78 dimensions, while the AE achieved its lowest MSE of 0.0093 at 15 dimensions. Notably, embeddings with lower MSE showed better clustering tendencies. This study highlights the potential of integrating video-based analysis into machine learning frameworks to improve the accuracy and depth of speech disorder diagnostics.
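The evaluation criterion described in the abstract, scoring an embedding by how well it reconstructs the original data, can be illustrated with a minimal sketch. This is not the authors' pipeline: the feature matrix is synthetic random data, the feature count of 100 is an arbitrary assumption, and the helper name `pca_reconstruction_mse` is hypothetical. PCA is implemented here with a plain SVD in NumPy.

```python
import numpy as np

def pca_reconstruction_mse(X, n_components):
    """Embed X with PCA and return (reconstruction MSE, embeddings)."""
    mean = X.mean(axis=0)
    Xc = X - mean                      # centre each feature
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]              # keep the top n_components directions
    Z = Xc @ W.T                       # low-dimensional embeddings
    X_hat = Z @ W + mean               # map the embeddings back to feature space
    mse = float(np.mean((X - X_hat) ** 2))
    return mse, Z

# Synthetic stand-in data (assumption): 60 samples x 100 hypothetical features.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 100))

for k in (5, 15, 59):
    mse, Z = pca_reconstruction_mse(X, k)
    print(f"dims={k:2d}  embedding shape={Z.shape}  MSE={mse:.6f}")
```

The reconstruction error can only decrease as more components are kept, which is why comparing MSE across embedding sizes is meaningful. The resulting embeddings `Z` could then be passed to a t-SNE implementation for visual inspection of clustering tendencies.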
Original language | English |
---|---|
Title of host publication | International Conference on Technological Innovation and AI Research, ICTIAIR 2025 |
Publisher | Institution of Engineering and Technology |
Pages | 32-37 |
Number of pages | 6 |
Volume | 2025 |
Edition | 4 |
ISBN (electronic) | 9781837243143, 9781837243150, 9781837243235 |
DOI | |
Status | Published - 2025 |
Event | 2025 International Conference on Technological Innovation and AI Research, ICTIAIR 2025 - Virtual, Online, Ecuador. Duration: 19 Mar 2025 → 21 Mar 2025 |
Conference
Conference | 2025 International Conference on Technological Innovation and AI Research, ICTIAIR 2025 |
---|---|
Country/Territory | Ecuador |
City | Virtual, Online |
Period | 19/03/25 → 21/03/25 |