Abstract
What do expressions like “colgar los tenis” (kick the bucket) or “dejar con el ojo cuadrado” (knock someone's socks off) have in common? Both are verbal idioms: word combinations that include a verb and whose meaning cannot be deduced by interpreting each word literally and separately. These expressions are part of Mexican Spanish and reflect cultural, social, and emotional aspects of the language, and they often deviate from traditional grammatical rules. But what happens when we try to detect these expressions automatically in digital texts? In this article, we explore how to teach computers to recognize them, with the goal of helping machines better understand how we speak in everyday life.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Copyright (c) 2025 Universidad Autónoma Metropolitana
