Johnatan Bonilla

Humboldt-Universität zu Berlin

Institut für Romanistik

I am a research associate in project A09, which focuses on register and dialect variation in Canarian Spanish. My research interests include corpus linguistics, computational linguistics, and the intersection of sociolinguistics, dialectology, and natural language processing (NLP).

I hold a PhD in Linguistics from Ghent University and a Master’s in Linguistics from the Instituto Caro y Cuervo (Colombia). Previously, I led projects on dialectology, lexicography, and NLP, including the digitization of the Linguistic and Ethnographic Atlas of Colombia (ALEC), the development of the New Linguistic Atlas of Colombia (NALAC), and the creation of the first morphosyntactic treebank for spoken Spanish (COSER-UD).

Repositories

GitHub | HuggingFace | UD Spanish-COSER

Previous Projects

Recent Publications

  • Bonilla, J. E., Merino Hernández, L. M., & Marttinen Larsson, M. (2025). BERT’s Interpretation of Literalmente ‘Literally’: What Deep Learning Models Can Tell Us about Synchronic Layering and Diachronic Shifts. Cognitive Semantics. https://doi.org/10.1163/23526416-bja10079
  • Monsalve Muñoz, U. de C., Bonilla, J. E., Rubio López, R. Y., & Luna Cortés, A. S. (2025). LEXICC: The Design and Development of an Online Dictionary Writing System. Lexikos. https://doi.org/10.5788/35-1-1989
  • Bonilla, J. E. (2024). Spoken Spanish PoS tagging: Gold standard dataset. Language Resources and Evaluation. https://doi.org/10.1007/s10579-024-09751-x
  • Fernández, J. O., Bonilla, J. E., & Rocha, L. Á. (2024). The influence of geographic variables in linguistic variation. Dialectologia. https://doi.org/10.1344/DIALECTOLOGIA2023.32.7
  • Bouzouita, M., Bonilla, J. E., & Segundo Díaz, R. L. (2024). Gaming for dialects: Creating an annotated and parsed corpus of European Spanish dialects through GWAPs. In Linguistic Corpora and Big Data in Spanish and Portuguese. https://doi.org/10.1515/9783110781465-005
  • Bonilla, J. E., Segundo Díaz, R. L., & Bouzouita, M. (2023). Using GWAPs for verifying PoS tagging of spoken dialectal Spanish. Conference paper. https://doi.org/10.1109/besc59560.2023.10386542
  • Bonilla, J. E. (2023). Superdialectos, dialectos y subdialectos del español de Colombia. Lexis. https://doi.org/10.18800/lexis.202302.002
  • Segundo Díaz, R. L., Bonilla, J. E., Bouzouita, M., & Rovelo Ruiz, G. (2023). Juegos con propósito para la anotación del Corpus Oral Sonoro del Español rural. Dialectologia et Geolinguistica. https://doi.org/10.1515/dialect-2023-0007

Projects

A09 On the interplay between register and socio-geographic variation in Canarian Spanish
MGK Integrated Graduate School

Contact

Hausvogteiplatz 5-7, R. 22, 10117 Berlin

j.bonilla@hu-berlin.de https://orcid.org/0000-0002-8166-3548